Production of heterologous peptides

ABSTRACT

The invention relates to a method and recombinant means for engineering the production of heterologous peptides in filamentous fungus. The invention involves the genetic manipulation of the glucoamylase gene in order to provide a restriction site in same so that the promoter sequence of the glucoamylase gene can be coupled to a sequence encoding a heterologous peptide whereby the production of the peptide will be under the control of the promoter sequence of the glucoamylase gene.

The invention relates to a method and recombinant means, particularly, but not exclusively, expression cassettes and expression/export cassettes, for the production of heterologous peptides. The method and means have particular application in the production of such peptides from the biotechnological exploitation of filamentous fungi and particularly Neurospora crassa.

The filamentous fungi secrete substantial amounts of protein, notably hydrolytic enzymes. Many of these enzymes are used in industrial processes such as the production of antibiotics and organic acids, the saccharification of starch, glucose isomerisation, the processing of wines and fruit juices and the degradation of cellulose and lignin (Bennett 1985; Bu'Lock and Kristiansen 1987). The promoter and signal sequences of the genes of such enzymes represent targets for manipulation for developing the filamentous fungi as hosts for heterologous gene expression. The potential for this technology has been reviewed with particular reference to the genus Aspergillus (Van den Hondel et al 1991).

The term heterologous gene expression is used in this document to mean the expression of genes not present or common in the host.

The genus Neurospora has several advantages for study with a view to its possible exploitation as a host for heterologous gene expression.

More specifically, the species Neurospora crassa is the most thoroughly studied and characterised of all the filamentous fungi (reviewed by Perkins DD [1992] Genetics 130:687-700), with more genes characterised and more genes cloned than any other species. It is extremely fast-growing, with simple growth requirements. It will grow on a wide range of carbon and nitrogen sources, and has a single complex growth requirement, for biotin. It will grow in liquid or on solid medium. It produces no toxic secondary metabolites, and in fact is a traditional oriental human food organism.

Neurospora in nature grows in a solid medium, and has efficient secreted enzyme systems for the utilisation of polysaccharide carbon sources. These include glucoamylase, the major exported protein when starch-induced. Secreted proteins in wild-type Neurospora reach levels of circa 1 g/l of spent medium; glucoamylase, when starch-induced, accounts for circa 20% of the total. For glucoamylase, there is evidence for two regulatory components: carbon catabolite repression, and induction by the substrate or some partial hydrolysis product of it.

Glucoamylases have been cloned and characterised from several fungi: Aspergillus awamori (Nunberg et al 1984), A. awamori var. kawachi (Hayashida et al 1989), A. niger (Boel et al 1984), A. oryzae (Hata et al 1991), A. shirousami (Shibuya et al 1990), Humicola grisea var. thermoidea (Berka et al, personal communication), Rhizopus oryzae (Ashikari et al 1986), Saccharomyces cerevisiae (Pardo et al 1988), S. diastaticus (Yamashita et al 1985), S. fibuligera (Itoh et al 1987), and S. occidentalis (Dohmen et al 1990).

Glucoamylases (exo-1,4-α-D-glucan glucohydrolase, EC 3.2.1.3) are secreted in large amounts by a variety of filamentous fungi. They catalyse the removal of single glucose units from the non-reducing ends of starch and other poly- and oligo-saccharides. Their use in industrial processes includes the production of glucose syrups from starch (Kennedy et al 1988), and the fermentation of sake (rice wine) in Japan. Heterologous expression systems in the above Aspergillus species of filamentous fungi commonly use their glucoamylase promoters to drive expression, their signal sequences to secrete foreign peptides, and their 3' flanking regions to direct termination (Archer et al 1990; Ward et al 1990, 1992).

Koh-Luar et al (1989) analysed culture supernatants of Neurospora crassa, growing on a variety of carbon sources, and showed that the protein present in the largest amount was a glucoamylase of approximately 69 kDa. This protein was purified and the N-terminal sequence of the glucoamylase determined.

The high expression and secretion properties of the glucoamylase gene make it an attractive candidate for use in heterologous gene expression. The glucoamylase promoter can be used independently of the glucoamylase open reading frame, so exploiting the promoter's high transcription levels as well as its regulation in response to extracellular carbon. The highly secreted open reading frame of the glucoamylase gene can be used in conjunction with the glucoamylase promoter to target foreign proteins into the secretory pathway of Neurospora crassa. In this case, the entire open reading frame or a portion of the glucoamylase gene can be attached in frame to the foreign gene.

Here we report the DNA sequence of the glucoamylase gene, gla-1, of Neurospora crassa together with flanking sequences and compare its amino-acid sequence with other glucoamylases.

Having obtained the DNA sequence structure of the aforementioned gene, we have characterised an unexpectedly high-level, regulated promoter, and determined key features of its carbon catabolite repression and its induction by a polysaccharide substrate such as starch or metabolites thereof. With this information, we are in a position to genetically engineer expression cassettes and expression/export cassettes containing this high-level regulated promoter along with any other pre-selected peptide. The control of production of this peptide is in accordance with the repression and induction features of the promoter. Thus, we can selectively control the production of the peptide according to the presence or absence of carbon catabolite.

It is apparent that this technology has great significance in the genetic engineering industry because it enables selective production of a pre-determined peptide in an extremely efficient and cost-effective way without the production of secondary metabolites. Further, since Neurospora, like other filamentous ascomycete fungi but unlike yeasts, tends to glycosylate proteins in a way resembling that of mammals, there is reasonable expectation that any heterologously produced mammalian peptide sequences requiring glycosylation for biological activity will in fact be biologically active.

In addition, Neurospora can be transformed at high efficiency, with transformed sequences being integrated, at least vegetatively stably, generally into heterologous locations in the genome.

Further, we also report here modifications of the glucoamylase coding region which improve the gene's utility as a vector for heterologous gene expression.

According to a first aspect of the invention there is therefore provided a regulated promoter having the DNA sequence structure shown in FIG. 1, or part thereof, or a functionally equivalent nucleotide sequence.

According to a second aspect of the invention there is provided a regulated promoter and an upstream activator having the DNA sequence structure shown in FIG. 1, or part thereof, and especially having the sequence structure shown in the first one thousand nucleotides of the DNA sequence structure shown in FIG. 1, or part thereof.

Preferably the DNA sequence structure shown in FIG. 1 encodes a protein the amino acid sequence of which is depicted in FIG. 1, or a protein of equivalent biological activity having substantially the amino acid sequence depicted in FIG. 1.

It follows that since the DNA sequence structure shown in FIG. 1 encodes the protein glucoamylase then the regulated promoter of the invention is the promoter controlling the expression of the glucoamylase gene.

According to a third aspect of the invention there is provided a regulated promoter as aforedescribed which is further provided with linkers whereby ligation of the promoter with a pre-selected gene encoding a desired protein is facilitated.

According to a yet further aspect of the invention there is provided a vector or plasmid (pPS8) incorporating the aforementioned DNA sequence structure.

Preferably said vector or plasmid incorporates a 904 nucleotide fragment of said DNA sequence structure located between a BamH1 site at nucleotide 98 and a HindIII site at nucleotide 1002.

According to a yet further aspect of the invention there is provided an expression cassette including at least the aforementioned regulated promoter DNA sequence and a pre-selected gene encoding a heterologous peptide.

Preferably the expression cassette also includes the upstream activator sequence.

Preferably further still said expression cassette includes a marker selectable in Neurospora which ideally is a gene encoding hygromycin-resistance.

Preferably further still said expression cassette contains a replication origin from, ideally, E. coli and preferably also an E. coli-selectable marker, for example a gene encoding ampicillin-resistance.

Preferably further still said expression cassette incorporates a multi-cloning site whereby any pre-selected gene sequence can be inserted via transcriptional fusion.

According to a yet further aspect of the invention there is provided an expression/export cassette which incorporates any one or combination of the aforementioned expression features and which further incorporates a DNA sequence structure encoding a secretion signal.

Preferably said expression/export cassette contains the aforementioned DNA sequence translationally fused to the coding sequence for the heterologous peptide.

Preferably three different expression/export cassettes are provided. The multiple cloning site oligonucleotide is in a different reading frame in each to permit in-frame translational fusion to the coding sequence for the heterologous peptide. This is achieved by appropriate design of the ends of the synthetic multiple cloning site oligonucleotide.
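The choice among three cassettes differing only in reading frame can be illustrated with a short calculation. The following Python sketch is an illustration only and is not part of the constructs described herein: the function name, the description of each cassette by a frame offset of 0, 1 or 2 nucleotides, and the example lengths are all assumptions made for the example.

```python
def required_frame_offset(upstream_coding_len: int, insert_lead_len: int = 0) -> int:
    """Return the number of padding nucleotides (0, 1 or 2) a cassette must supply.

    upstream_coding_len: coding nucleotides of the carrier sequence that
        precede the cloning junction (hypothetical value supplied by the user).
    insert_lead_len: nucleotides between the junction and the first full codon
        of the heterologous coding sequence (hypothetical value).
    """
    if upstream_coding_len < 0 or insert_lead_len < 0:
        raise ValueError("lengths must be non-negative")
    # The fusion is in frame when the total number of nucleotides ahead of the
    # first codon of the insert is a multiple of three.
    return (-(upstream_coding_len + insert_lead_len)) % 3


if __name__ == "__main__":
    # Arbitrary example lengths, chosen only to show all three outcomes.
    for length in (105, 104, 103):
        print(length, "->", required_frame_offset(length))
```

Under these assumptions, each of the three cassettes corresponds to one of the three possible offsets, so that one of them always gives an in-frame fusion whatever the position of the junction in the insert.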

It will be apparent to those skilled in the art that the provision of an expression/export cassette enables a heterologous peptide to be both expressed and then exported into the culture medium. Although this limits the range of peptides which can be made, it has the advantage of facilitating the purification of those peptides that can be made using this method.

In preferred embodiments of the invention the selected heterologous peptide is a medical or pharmaceutical peptide such as insulin, human growth hormone, interleukin or indeed any other suitable peptide.

According to a yet further aspect of the invention there is provided expression cassettes and/or expression/export cassettes, also referred to as constructs or plasmids, as described hereinafter for enabling the working of the invention.

In a preferred embodiment of the invention there is provided the plasmid pGla-Xho I as illustrated in FIG. 3.

In yet a further embodiment of the invention there is provided the plasmid pGla-Mro as illustrated in FIG. 5.

In yet a further preferred embodiment of the invention there is provided the plasmid pGE as illustrated in FIG. 6.

In yet a further preferred embodiment of the invention there is provided the plasmid pGS as illustrated in FIG. 8.

Further preferred constructs or plasmids of the invention include those plasmids which could be termed intermediary and which are used either in isolation or combination to provide the abovementioned plasmids. For example, intermediary constructs include the plasmids pGla XL-, pGla XLX and pGla MXLX used to manufacture the construct pGE, and other intermediary constructs include pGla XhoI used to manufacture construct pGS.

According to a yet further aspect of the invention there is provided primers for manufacturing the constructs or plasmids herein described, which primers are shown in FIG. 4.

It will be apparent to those skilled in the art that shorter primer sequences may be used in order to work the invention or that substitutions of one or more bases within the primers described in FIG. 4 may be used providing hybridisation can be achieved. Thus it follows that shorter primer sequences or sequences with minor internal base substitutions may be used to work the invention and can easily be tested for this purpose using the methods described herein.

According to a yet further aspect of the invention there is provided a method for transforming filamentous fungus comprising the insertion of at least one of the aforementioned expression cassettes and/or expression/export cassettes into same using recombinant techniques.

In a preferred embodiment of the invention said filamentous fungus is Neurospora crassa.

According to a yet further aspect of the invention there is provided a filamentous fungus including at least one expression cassette and/or expression/export cassette according to the invention.

Preferably said filamentous fungus is Neurospora crassa.

According to a yet further aspect of the invention there is provided a method for the production of a pre-selected heterologous peptide from at least one filamentous fungus comprising:

a) providing either an expression cassette or an expression/export cassette as aforedescribed;

b) transforming a pre-selected species of filamentous fungus with at least one of said cassettes;

c) culturing said transformed fungus; and

d) harvesting said heterologous peptide.

The invention will now be described, by way of example only, with reference to the following figures wherein:

BRIEF DESCRIPTION OF THE FIGURES

FIG. 1 SEQ ID NO:6 shows the DNA sequence structure of the glucoamylase gene and the corresponding amino acid sequence structure of the protein glucoamylase. The glucoamylase reading frame is shown in upper case, together with its translation below. Untranslated regions are shown in lower case. The numbering for the nucleotides is based on the A of the ATG being +1. The numbering for the amino acids is in brackets. The putative promoter elements are underlined in bold. The functional domains of the intron are in bold. The leader sequence of the protein is in bold, with the signal splice shown as an arrow. The Lys-Arg (Kex2) propeptide processing sites are underlined. The putative polyadenylation signal is also underlined in bold.

FIG. 2 is an illustration of plasmid pPS8.

FIG. 3 SEQ ID NO:3 shows the DNA sequence structure of the plasmid pGla-Xho I.

FIG. 4 SEQ ID NO:8, 10, 11, 12, 13 shows the primers used for the creation of the Mro I site at the final codon of the glucoamylase gene.

FIG. 5 SEQ ID NO:5 shows the DNA sequence structure of the plasmid pGla-Mro.

FIG. 6 SEQ ID NO:2 shows the DNA sequence structure of the plasmid pGE.

FIG. 7 SEQ ID NO:4, 7 shows the relative location of the Sal I site to the second Kex-2 site in the glucoamylase gene.

FIG. 8 SEQ ID NO:1 shows the DNA sequence structure of the plasmid pGS; and

FIG. 9 shows the highest yields of DSPA from all the expression vectors tested.

CLONING OF THE GLUCOAMYLASE GENE

The Neurospora glucoamylase gene gla-1 was cloned and sequenced by conventional methods such as sequence alignment of the gene from other species, design of nested PCR primers, and production of a fragment by PCR which was used to identify a genomic clone from a Neurospora genomic library in the vector lambda J1. The clone was sub-cloned into pBluescript, and sequenced by the Sanger dideoxy method.

The gene encodes a deduced protein of 626 amino acids, with an unglycosylated molecular weight of 66,575 Da. This includes a leader peptide of 35 amino acids when compared to the known N-terminus of the secreted protein.

We have sequenced 938 base pairs upstream of the translation initiation codon. There is a TATA box at position -101 with respect to the ATG codon. The actual sequence is TATATAA and the eukaryotic consensus is TATA(A/T)TA. There are several potential, although no perfect, CAAT boxes upstream of the TATA box, the most likely one to function being at -133 to the ATG start codon (CATCAATAT). The eukaryotic consensus sequence is GG(C/T)CAATCT.
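The degree of resemblance between the observed elements and the consensus sequences quoted above can be counted position by position. The following Python sketch is an illustration only: the function names are hypothetical, and the ungapped, full-length alignment of each observed element against its consensus is an assumption of the example rather than a method stated herein.

```python
import re


def expand_consensus(consensus: str) -> list[set[str]]:
    """Turn a pattern such as 'TATA(A/T)TA' into one set of allowed bases per position."""
    positions = []
    for alternatives, single in re.findall(r"\(([ACGT/]+)\)|([ACGT])", consensus):
        positions.append(set(alternatives.split("/")) if alternatives else {single})
    return positions


def count_mismatches(observed: str, consensus: str) -> int:
    """Count positions where the observed base is not allowed by the consensus.

    Assumes an ungapped alignment over the full length of both sequences.
    """
    allowed = expand_consensus(consensus)
    if len(observed) != len(allowed):
        raise ValueError("observed sequence and consensus differ in length")
    return sum(base not in ok for base, ok in zip(observed, allowed))


if __name__ == "__main__":
    # Sequences and consensus patterns as quoted in the text.
    print("TATA box mismatches:", count_mismatches("TATATAA", "TATA(A/T)TA"))
    print("CAAT box mismatches:", count_mismatches("CATCAATAT", "GG(C/T)CAATCT"))
```

On this simple count the TATA element differs from its consensus at fewer positions than the CAAT element does, which is consistent with the statement that no perfect CAAT box is present.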

The initiation points of translation have been shown to have a very strong requirement for a purine at position -3 with respect to the initiating AUG.

Isolation of the Essential Promoter Region

The promoter, regulatory regions, and probably also any UAS (upstream activator sequences) are contained in the first 938 nucleotides of the determined sequence. This, together with the rest of the clone, is contained in our plasmid pPS8. A 904 nucleotide fragment between a BamH1 site at nucleotide 98 and a HindIII site at nucleotide 1002, containing the major part of the promoter and the N-terminal major part of the signal peptide of the gene product, may be readily sub-cloned and tested for promoter and starch-regulatory activity with a suitable reporter gene.

A suitable restriction site at nucleotide 1002 was identified in the N-terminal part of the open reading frame of the gene, and suitable constructs were made using a reporter gene so as to study promoter activity and regulator functions as described below.

Plasmid pPS8

FIG. 2 is a diagram of the pPS8 plasmid. The circa 2 kb portion in the upper left between the two ClaI sites, indicated by the double line, is the vector, pBluescript. The construct results from cleaving pBluescript at the single ClaI site in its multiple cloning site and inserting the Neurospora ClaI fragment containing the glucoamylase gene. The outer arc labelled gla gene indicates the approximate position of the coding region of the gene, extending from N- to C-terminus. The promoter region is in the circa 1 kb upstream region between the ClaI site at 12 o'clock and the N-terminus of the coding region. The numbers indicate the approximate sizes of restriction fragments in nucleotide pairs. The restriction sites for BamH1, HindIII and EcoR1 are shown in the Neurospora insert for reference.

Choice of Reporter Gene

Two obvious choices of reporter gene exist. The first of these is the well-characterised GUS (β-glucuronidase) reporter gene available in the plasmid pNom123. This has the hph hygromycin-resistance gene as its Neurospora-selectable marker. An alternative reporter gene is the Neurospora tyr tyrosinase construct pTyr103.

Isolation of the Essential Sequence of the Promoter

Certain promoter features have already been identified by sequence homology. These include a putative CAAT box at nucleotide 804-812 (actual sequence CATCAATAT) and a TATA box at nucleotide 838-844 (actual sequence TATATAA), because of their resemblance to consensus sequences for these promoter features. Another feature identified by homology with the promoter sequence of the Aspergillus α-amylase is the region nucleotide 301-340 of the Neurospora gla-1 sequence, with circa 75% sequence homology. This may be a UAS, or other essential feature. Two transcription origins have also been identified by primer extension, at nucleotide 885 and at nucleotide 892.
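Positions are quoted herein in two numbering schemes: relative to the A of the ATG (+1, with no position 0) and as absolute nucleotide numbers in the determined sequence. The following Python sketch converts between them as an illustration. It assumes that absolute numbering starts at 1 at the first sequenced nucleotide and that exactly 938 nucleotides precede the ATG; the function names are hypothetical. Under these assumptions the TATA box at -101 maps to nucleotide 838, in agreement with the coordinates given above.

```python
UPSTREAM_LEN = 938  # base pairs sequenced upstream of the ATG, as stated in the text


def relative_to_absolute(rel: int, upstream_len: int = UPSTREAM_LEN) -> int:
    """Convert an ATG-relative position (A of ATG = +1, no position 0) to an absolute one."""
    if rel == 0:
        raise ValueError("there is no position 0 in ATG-relative numbering")
    # Position -1 is the last upstream nucleotide; +1 is the first after it.
    return upstream_len + rel + (1 if rel < 0 else 0)


def absolute_to_relative(pos: int, upstream_len: int = UPSTREAM_LEN) -> int:
    """Convert an absolute nucleotide number (starting at 1) to an ATG-relative position."""
    if pos < 1:
        raise ValueError("absolute positions start at 1")
    return pos - upstream_len - (1 if pos <= upstream_len else 0)


if __name__ == "__main__":
    print("TATA box at -101 ->", relative_to_absolute(-101))
    print("CAAT box at -133 ->", relative_to_absolute(-133))
    for origin in (885, 892):  # transcription origins quoted in the text
        print("origin", origin, "->", absolute_to_relative(origin))
```

The same conversion places the position quoted for the CAAT box within the element at nucleotide 804-812 and places both transcription origins upstream of the ATG.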

Experimental investigation of the limits of the essential promoter was undertaken by cleavage of the sub-cloned promoter-reporter gene construct and deletion in from the 5'-end of the sub-clone. This involves either deletion of specific restriction fragments, subject to available restriction sites, so as to provide a nested set of deletions, or exonuclease degradation. Here it was done using mung bean exonuclease digestion. In either case, the deletions, or shortened promoter sequences, were religated into a reporter construct and tested for residual promoter activity and regulation.

Construction of an Expression Cassette

Deposits of plasmids pGLA-Xho I, pGLA-Mro I and pGE have been made, and the deposition details are as follows: pGLA-Xho I, ATCC deposition designation 75858; pGLA-Mro I, ATCC deposition designation 75859; and pGE, ATCC deposition designation 75860.

All DNA modifying enzymes were bought from Boehringer Mannheim or New England Biolabs. Plasmids were transformed into E. coli DH5α cells.

Modification of the glucoamylase gene for use in targeting foreign protein into the endoplasmic reticulum required the insertion of a convenient restriction site in the open reading frame of the glucoamylase gene; more preferably, the restriction site was located at the last codon of the open reading frame. The restriction site tggcca, recognised by the restriction enzymes Mro I (sold by Boehringer Mannheim) and BspE I (sold by New England Biolabs), was, in the first instance, placed at the last codon of the glucoamylase gene. The restriction site tggcca will be referred to hereinafter as Mro I. In order to engineer the Mro I site, PCR primers were created. The 5' primer encompassed the unique Ppum I site at position 2163 of the glucoamylase open reading frame (see FIG. 3 SEQ ID NO:3). The 3' primer, containing an Mro I site, hybridizes at the 3' end of the gla gene (see FIG. 4 SEQ ID NO:12, 13).

The glucoamylase with the Mro I site was created in a two-step procedure.

1. The 5' upstream PCR fragment was amplified and cloned into the Sma I site in a pNEB 193 vector. The PCR fragment was orientated so the 5' Ppum I site was proximal to the Eco RI site on the polylinker of pNEB 193. The proper clone was named pMro.

pMro: Eco Sac, Ppum I PCR Mro I Asc I, Xba I, Hind III

2. Next, the remainder of the gla gene was inserted by digestion of the glucoamylase clone pGla-Xho I. This plasmid contains the entire gla gene; however, the downstream unsequenced and non-transcribed area was deleted. pGla-Xho I was digested with the restriction enzymes Sac I and Ppum I, and the resulting fragment was ligated into the Sac I/Ppum I sites of pMro. The Sac I site of pGla-Xho I was derived from the linker and not from the coding region of glucoamylase; consequently, no glucoamylase sequence was deleted (see FIG. 5).

pGla-Mro: Sac, Gla I Ppum I Mro, Asc I etc.

The glucoamylase gene transcribes a message of 1943 bases, not including the polyadenylation sites. The expression construct, when fused to a cDNA to be expressed, will require the transcription of a longer message. The larger open reading frame may be transcribed less efficiently than the original, shorter construct. In an attempt to increase transcription efficiency we deleted 1575 bp from the glucoamylase open reading frame, creating the plasmid pGE (plasmid Glucoamylase, Eco RI; see FIG. 6 SEQ ID NO:2).

1. The construct pGla-Xho was digested with Cla I and Xba I and the sticky ends made blunt with E. coli polymerase I (Klenow fragment) to remove the 5' polylinker. The DNA was then recircularised and transformed into competent E. coli DH5α cells. We named this construct pGla XL-.

2. We next added an Xba I linker at the Bsa AI site in the glucoamylase gene. The Bsa AI site at position 3542 in the construct pGla XL- is thirteen base pairs away from the termination codon of glucoamylase. Complete digestion with Bsa AI would produce three fragments; consequently, a partial digest of pGla XL- with Bsa AI was performed. A linearised band corresponding to the size of pGla XL- was gel purified. Added to the gel-purified fragment was 200 ng of an 8 bp Xba I linker and 400 units of T4 ligase. Clones were screened for insertion of the Xba I linker into the Bsa AI site at position 3542. Properly identified clones were renamed pGla XLX.

3. The clone pMro (described above) was digested with Ppum I and Xba I. The released fragment was ligated into pGla XLX at the Ppum I/Xba I sites. The correct clone was identified by restriction digest and renamed pGla MXLX.

4. The clone pGla MXLX was then digested with Eco RI and Mro I, releasing 1575 bp of the glucoamylase open reading frame. The sticky ends were made blunt by filling in with E. coli polymerase I (Klenow fragment). The new expression construct was renamed pGE (plasmid Glucoamylase, Eco RI; see FIG. 6 SEQ ID NO:2). This construct contains the fusion site for glucoamylase targeting at the Eco RI site in the glucoamylase gene, 359 bp from the glucoamylase start codon.
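The reason for the partial digest in step 2 above can be illustrated by comparing the products of complete and single cutting of a circular plasmid carrying three sites. The following Python sketch is an illustration only. The site at 3542 is taken from the text; the plasmid length and the other two site positions are hypothetical values chosen for the example, as are the function names.

```python
def complete_digest(plasmid_len: int, sites: list[int]) -> list[int]:
    """Fragment sizes obtained when a circular plasmid is cut at every site."""
    cuts = sorted(set(sites))
    if any(not 0 <= cut < plasmid_len for cut in cuts):
        raise ValueError("site positions must lie within the plasmid")
    if not cuts:
        return [plasmid_len]  # uncut circle
    # Fragments between consecutive cuts, plus the one spanning the origin.
    sizes = [b - a for a, b in zip(cuts, cuts[1:])]
    sizes.append(plasmid_len - cuts[-1] + cuts[0])
    return sizes


def single_cut_products(plasmid_len: int, sites: list[int]) -> dict[int, int]:
    """Map each site to the size of the linear molecule made by cutting there only.

    A single cut of a circle always gives one full-length linear molecule,
    so molecules cut at different sites co-migrate on a gel.
    """
    return {cut: plasmid_len for cut in sorted(set(sites))}


if __name__ == "__main__":
    PLASMID_LEN = 5000              # hypothetical plasmid size
    SITES = [900, 2100, 3542]       # 3542 from the text; 900 and 2100 hypothetical
    print("complete digest:", complete_digest(PLASMID_LEN, SITES))
    print("single cuts:", single_cut_products(PLASMID_LEN, SITES))
```

Because the full-length linear band from a partial digest contains molecules cut at any one of the sites, the recovered clones still have to be screened for linker insertion at the intended site, as described in step 2.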

We made a second glucoamylase truncated fusion expression cassette. This construct contains the fusion junction at the first Sal I site in the glucoamylase open reading frame, at position 1133 in pGla XhoI (see FIG. 1 SEQ ID NO:6). The Sal I site was chosen because it occurs immediately after the second kex-2 site at Lys₃₄-Arg₃₅ (see FIG. 5). Kex-2 sites are found in many fungal systems as proteolytic cleavage sites for removal of propeptides (reviewed in Stone et al 1993).

1. The clone pGla MXLX was digested with the restriction enzymes Sal I and Mro I. The sticky ends were made blunt with E. coli DNA polymerase I (Klenow fragment). The digested DNA was ligated together with 400 units of T4 DNA ligase. The proper clone was identified by restriction analysis. The clone was named pGS (plasmid Glucoamylase, Sal I), see FIG. 8.

Transformation into Neurospora

Standard transformation methodology was used to effect the transformation of DNA constructs into Neurospora spheroplasts, using the cell wall-degrading enzyme Novozym 234 (Radford et al [1981] Molec Gen Genet 184, 567-569).

DSPA Production

The glucoamylase expression vectors were used to express a mammalian thrombolytic protein secreted from the salivary glands of the vampire bat Desmodus rotundus. The protein DSPA (Desmodus salivary plasminogen activator) is an anti-coagulant that binds to fibrin and activates the endogenous plasminogen, leading to fibrin degradation. The cDNA of DSPA was given to us by Berlex Biosciences, Brisbane, Calif., for research purposes.

We have engineered the DSPA clone into a variety of vectors. With the glucoamylase vectors, we have replaced the 5' DSPA signal sequence with a Mro I site to facilitate cloning into the expression vectors pGla-Mro and pGE. Following the Mro I site, we placed a kex-2 proteolytic site to remove the glucoamylase protein from DSPA. The expression vectors were co-transformed with a selectable marker into competent Neurospora crassa spheroplasts.

In FIG. 9 we see the levels of DSPA produced by several expression vectors. The samples in the boxed area represent the levels of DSPA produced by the glucoamylase vectors. pGATE-TF contains the DSPA gene as a total fusion protein in the expression vector pGla-Mro. pGATEco has the DSPA gene fused to the truncated glucoamylase vector pGE.

Selection of Transformants

Transformants were selected for pNom123 (the GUS reporter gene) by initial selection for hygromycin-resistance. Expression of the GUS activity was detected in a subsequent step by the development of blue colour on X-gluc substrate.

With pTyr103, the derived plasmids with putative promoter inserts have no independent selectable marker. They were co-transformed with a second plasmid with a selectable marker, a process which gives circa 50% co-integration of the unselected plasmid. Although a number of co-selectable plasmids are suitable, an example would be pFB6 (Buxton and Radford [1984] Molec Gen Genet 190, 403-405), containing the cloned pyr-4 gene of Neurospora, selecting transformants by complementation of a pyrimidine-requiring recipient strain. Transformants thus selected demonstrated promoter activity from the gla-1 promoter region by expression of tyrosinase activity in vegetative culture, tyrosinase only normally being active in the sexual phase of the life cycle. Tyrosinase activity is again detected colorimetrically, by the conversion of supplied L-tyrosine to black melanin pigment, or of L-DOPA to a soluble red pigment.
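The co-integration frequency of circa 50% quoted above gives a rough guide to how many selected transformants need to be examined to recover one carrying the unselected plasmid. The following Python sketch is an illustration only: it assumes that each selected transformant independently carries the unselected plasmid with probability 0.5, and the function names and target probabilities are hypothetical.

```python
import math


def prob_at_least_one(n_colonies: int, p_cointegration: float = 0.5) -> float:
    """Probability that at least one of n selected transformants carries the unselected plasmid."""
    if n_colonies < 0 or not 0 < p_cointegration < 1:
        raise ValueError("need n >= 0 and 0 < p < 1")
    return 1 - (1 - p_cointegration) ** n_colonies


def colonies_needed(target: float, p_cointegration: float = 0.5) -> int:
    """Smallest number of transformants giving at least the target probability of one co-transformant."""
    if not 0 < target < 1 or not 0 < p_cointegration < 1:
        raise ValueError("target and p must lie strictly between 0 and 1")
    return math.ceil(math.log(1 - target) / math.log(1 - p_cointegration))


if __name__ == "__main__":
    for target in (0.95, 0.99):  # arbitrary example targets
        n = colonies_needed(target)
        print(f"target {target}: screen {n} colonies "
              f"(probability {prob_at_least_one(n):.4f})")
```

If co-integration events are not independent, or the frequency differs from 50% in a particular experiment, the numbers change accordingly.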

The red colour from L-DOPA and the blue colour from X-gluc are both quantitatively assayable.

Types of Product

This process would be suitable for the expression in Neurospora, and fermentor-scale production, of a wide range of peptide products, and is especially suited to those mammalian peptides requiring glycosylation for biological activity. It would be particularly appropriate for production of high value medical and pharmaceutical peptides, e.g. insulin, human growth hormone, interleukin, etc.

Method of Production

Neurospora grows well in large-scale fermentation conditions, in either aerated liquid fermentors or in solid state fermentations. It grows on a wide range of cheap carbon and nitrogen sources, and has a complex requirement for only biotin, and that in minuscule amounts.

Regulation of Production

The glucoamylase gene, and hence its promoter, is inducible by starch or maltose, and repressible by glucose. Glucoamylase is the major exported protein in Neurospora when suitably induced (Koh-Luar et al [1989] Enz Microb Technol 11). The derived expression and expression/export cassettes may be designed to have or not have such regulation. For certain recombinant products, constitutive production may be desirable, and cassettes without regulatory sequences would be used. For other recombinant products, a short induced expression phase might be advantageous or desirable. In such a case, production could be repressed initially by growth on glucose as carbon source during log-phase growth of biomass (mycelium), after which exhaustion or removal of glucose and addition of an inducing carbon source (starch or maltose) would lead to an induced expression phase.
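The two-phase regime described above can be summarised as a simple decision rule. The following Python sketch is an illustration only: the function name, the state labels and the example schedule are hypothetical, and the rule that glucose repression overrides induction when both carbon sources are present is an assumption of the example based on carbon catabolite repression, not a measured result.

```python
def promoter_state(glucose_present: bool, inducer_present: bool) -> str:
    """Qualitative state of a regulated glucoamylase promoter.

    Assumption: glucose repression takes priority over induction by
    starch or maltose when both are present in the medium.
    """
    if glucose_present:
        return "repressed"
    if inducer_present:
        return "induced"
    return "uninduced"


if __name__ == "__main__":
    # Hypothetical two-phase fermentation: (phase, glucose present, inducer present).
    schedule = [
        ("log-phase biomass growth on glucose", True, False),
        ("glucose exhausted, starch or maltose added", False, True),
    ]
    for phase, glucose, inducer in schedule:
        print(f"{phase}: {promoter_state(glucose, inducer)}")
```

A cassette built without the regulatory sequences would fall outside this rule, since constitutive production is then intended.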

In Summary

We have modified the glucoamylase gene of Neurospora crassa to improve its utility as a vector for heterologous gene expression. We have added a convenient restriction site at the last codon of the glucoamylase gene open reading frame to create a total fusion expression construct, pGla-Mro I (see FIG. 3 SEQ ID NO:3). Further, we have reduced the transcript size for glucoamylase expression by deleting 1575 base pairs of the glucoamylase gene open reading frame, so creating the plasmid pGE (see FIG. 6 SEQ ID NO:2). In addition, we engineered a restriction site 13 base pairs from the termination codon of glucoamylase to place the cDNA of interest proximal to the polyadenylation site of the glucoamylase gene. Finally, we have created an expression cassette containing the glucoamylase signal, propeptide and polyadenylation sites. This expression cassette is plasmid pGS (see FIG. 8 SEQ ID NO:1).

    __________________________________________________________________________    SEQUENCE LISTING                                                              (1) GENERAL INFORMATION:                                                      (iii) NUMBER OF SEQUENCES: 14                                                 (2) INFORMATION FOR SEQ ID NO: 1:                                             (i) SEQUENCE CHARACTERISTICS:                                                 (A) LENGTH: 5042 base pairs                                                   (B) TYPE: nucleic acid                                                        (C) STRANDEDNESS: double                                                      (D) TOPOLOGY: circular                                                        (ii) MOLECULE TYPE: other nucleic acid                                        (A) DESCRIPTION: /desc = "Plasmid pGE"                                        (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:                                      AAATTGTAAACGTTAATATTTTGTTAAAATTCGCGTTAAATTTTTGTTAAATCAGCTCAT60                TTTTTAACCAATAGGCCGAAATCGGCAAAATCCCTTATAAATCAAAAGAATAGACCGAGA120               TAGGGTTGAGTGTTGTTCCAGTTTGGAACAAGAGTCCACTATTAAAGAACGTGGACTCCA180               ACGTCAAAGGGCGAAAAACCGTCTATCAGGGCGATGGCCCACTACGTGAACCATCACCCT240               AATCAAGTTTTTTGGGGTCGAGGTGCCGTAAAGCACTAAATCGGAACCCTAAAGGGAGCC300               CCCGATTTAGAGCTTGACGGGGAAAGCCGGCGAACGTGGCGAGAAAGGAAGGGAAGAAAG360               CGAAAGGAGCGGGCGCTAGGGCGCTGGCAAGTGTAGCGGTCACGCTGCGCGTAACCACCA420               CACCCGCCGCGCTTAATGCGCCGCTACAGGGCGCGTCCCATTCGCCATTCAGGCTACGCA480               ACTGTTGGGAAGGGCGATCGGTGCGGGCCTCTTCGCTATTACGCCAGCTGGCGAAGGGGG540               GATGTGCTGCAAGGCGATTAAGTTGGGTAACGCCAGGGTTTTCCCAGTCACGACGTTGTA600               AAACGACGGCCAGTGAATTGTAATACGACTCACTATAGGGCGAATTGAGCTCCACCGCGG660               TGGCGGCCGCTCTAGCGATGGCAGCCACCATTCATTTCTCGATGCGACGGTAAACGACGC720               
CCGCGGCAGATTAGGTCATTGCCGAACGGATTGAAGCTCTCTCCATCTTGGATCCATTCC780               CGGCCAATCCCGTCTCGGCCAACCACACTGTCCACTCGCCCAGGTCAGCAGCTCAGGACT840               CTCTCCTGGTTTGGTACCGCTTAGTGTAGAGCATACCGCTCTCAGTCCCCATAGACCAAC900               CATAACACCGCACGTTCTCTTTCACTCAAGATGCTTATCATGTCCCCTCTTTCTGCTCCA960               ATGATTCGGACTGGTCGAATACCAATGAGACAAGCGAGAGCGCAGTGCGAGCAAGCGTTC1020              CTGCAGATAGAGCAGTGGGACTGCCGCGCCACAAAGGAAGAGGATCGTGACGTGACGTGA1080              CCAGTGACCAGAAAGCAGAAGATCCAAAAGAGTCAAAAGGACCGAGCCTCACCTACAGTA1140              ATGGCCCGGATGGCACTCAAGACCGTCCTCTCGGCCCTTTCTCCAACTCTTCTCCTTCCA1200              TAATTCACCTAGGTACATACACGGCCTACGCTTCCGCCTCATCCCATCCCATCCCATCCC1260              ATCCCATCCCATCGACGACTCTAACCCGCCCGCGAGTGCAAACCTCGTCCACGAACGGAC1320              ACCCCGGCTCTCCTCCGAAGCCCTTGCAAGTGGAAGCTGAGGTTGCCGAACTTAGACGAC1380              CAGGTTCACCAGCCGGACCGCAACTCGAACGTCAGAATACAGCCTCAGCCTCCAAAGGGG1440              GTTAACGCCAAGCGAGAGCAAGACAAGATCGTCGCCCATCAATATCCTGGACAAGACAAC1500              ATGGACGCAATATATAACCTCAAGCAAGTCCTCCTCAGCAACCATGATTTCACCACCAGC1560              CTGGTCTCCAACGCAACAGACTTCTCGACAAGTCCCTTGACCTACTTCGCCATGCATCTC1620              GTCTCTTCGCTCCTCGTCGTGGGCGCCGCCTTCCAGGCCGTGCTCGGTCTGCCGGATCCT1680              CTGCATGAAAAGAGGCACAGCGACATCATCAAGCGGTCTGTCGACTCGTATATCCAGACC1740              GAGACTCCCATTGCGCAGAAGAACCTTCTGTGCAACATCGGTGCTTCTGGATGCAGAGCC1800              TCCGGTGCTGCCTCTGGTGTTGTGGTTGCCTCCCCTTCCAAGTCGAGCCCTGACTGTAAG1860              TGGAAATTGCACAGTGTGTCTCATCTCTCATGGCAGCATAGCTCACAGTGTCGATAGACT1920              GGTATACCTGGACTCGTGATGCCGCCCTTGTCACCAAGCTTATTGTCGACGAATTCCGGC1980              GCGCCCCCGGGTTAATTAAGTCTAGAGTGGAGGTAAATCGCTTGCTTCGTACTAGGTAGT2040              AAGTAGTGATTGGGAAAAGGAAATGAGAGAACGGGAACGGGAACGGGAACGGGAATTTGT2100              GATTACAAAGTGTAAAATTAATAGGCCCGGGATTTTGGTTAGATGCATAAGGGGGGCAGG2160              GGGGGCTAGGAAACGGAAGGTTGCATATCAACCGAGGAAGAATGGGAAGAAAGGGAAGAA2220              
AGACAGAAAGAAGGAACAACAGGACTTCATTCTCTCACATCGACATGAGCTACCTGGGCA2280              TCAGCTACCTGGGCATCTTGATTTCCTTTTTAGAAGATTGTTTTGTATCCTTTTTTCTTC2340              CTCCCTTTTCTTTTCTTGTCCGTCTCTTACACCTACCTATTTTTAGCCAAAGTCCACACA2400              CACACAAACTTTTTGTTAGATATTCTCTGTATCAAAATTGACAAGTTTCAATGTTATACA2460              GTACCTTGCCAAGTTTAATACACATTCAAATCAATCAACCACACACACACAAGTTTTATT2520              GTGCAGAAATGGAGTGAAGAAGAAACATGTTTGGGATTATGATGACAAGCTTCTCAACAA2580              AATTTCAACGAGTTAAGCTTCAAAGGTCCGCTGGCTCAATGGCAGAGCGTCTGACTACGA2640              ATCAGGAGGTTCCAGGTTCGACCCCTGGGTGGATCGAGTTGCAAATTGGTACTTTGAGTA2700              CCAAAGTTCCTTTTTTTTTTTCGTTTGGCTCTCTGCTTTTCGACAGTTCACTGAGTCATG2760              TGCAAGACACCCCTGATCGGGTACGTACTGAACTGCTTTTGGTGCAGTGCAATGGTTCTC2820              GAGGGGGGGCCCGGTACCCAGCTTTTGTTCCCTTTAGTGAGGGTTAATTCCGAGCTTGGC2880              GTAATCATGGTCATAGCTGTTTCCTGTGTGAAATTGTTATCCGCTCACAATTCCACACAA2940              CATAGGAGCCGGAAGCATAAAGTGTAAAGCCTGGGGTGCCTAATGAGTGAGGTAACTCAC3000              ATTAATTGCGTTGCGCTCACTGCCCGCTTTCCAGTCGGGAAACCTGTCGTGCCAGCTGCA3060              TTAATGAATCGGCCAACGCGCGGGGAGAGGCGGTTTGCGTATTGGGCGCTCTTCCGCTTC3120              CTCGCTCACTGACTCGCTGCGCTCGGTCGTTCGGCTGCGGCGAGCGGTATCAGCTCACTC3180              AAAGGCGGTAATACGGTTATCCACAGAATCAGGGGATAACGCAGGAAAGAACATGTGAGC3240              AAAAGGCCAGCAAAAGGCCAGGAACCGTAAAAAGGCCGCGTTGCTGGCGTTTTTCCATAG3300              GCTCGGCCCCCCTGACGAGCATCACAAAAATCGACGCTCAAGTCAGAGGTGGCGAAACCC3360              GACAGGACTATAAAGATACCAGGCGTTCCCCCCTGGAAGCTCCCTCGTGCGCTCTCCTGT3420              TCCGACCCTGCCGCTTACCGGATACCTGTCCGCCTTTCTCCCTTCGGGAAGCGTGGCGCT3480              TTCTCAATGCTCACGCTGTAGGTATCTCAGTTCGGTGTAGGTCGTTCGCTCCAAGCTGGG3540              CTGTGTGCACGAACCCCCCGTTCAGCCCGACCGCTGCGCCTTATCCGGTAACTATCGTCT3600              TGAGTCCAACCCGGTAAGACACGACTTATCGCCACTGGCAGCAGCCACTGGTAACAGGAT3660              TAGCAGAGCGAGGTATGTAGGCGGTGCTACAGAGTTCTTGAAGTGGTGGCCTAACTACGG3720              
CTACACTAGAAGGACAGTATTTGGTATCTGCGCTCTGCTGAAGCCAGTTACCTTCGGAAA3780              AAGAGTTGGTAGCTCTTGATCCGGCAAACAAACCACCGCTGGTAGCGGTGGTTTTTTTGT3840              TTGCAAGCAGCAGATTACGCGCAGAAAAAAAGGATCTCAAGAAGATCCTTTGATCTTTTC3900              TACGGGGTCTGACGCTCAGTGGAACGAAAACTCACGTTAAGGGATTTTGGTCATGAGATT3960              ATCAAAAAGGATCTTCACCTAGATCCTTTTAAATTAAAAATGAAGTTTTAAATCAATCTA4020              AAGTATATATGAGTAAACTTGGTCTGACAGTTACCAATGCTTAATCAGTGAGGCACCTAT4080              CTCAGCGATCTGTCTATTTCGTTCATCCATAGTTGCCTGACTGCCCGTCGTGTAGATAAC4140              TACGATACGGGAGGGCTTACCATCTGGCCCCAGTGCTGCAATGATACCGCGAGACCCACG4200              CTCACCGGCTCCAGATTTATCAGCAATAAACCAGCCAGCCGGAAGGGCCGAGCGCAGAAG4260              TGGTCCTGCAACTTTATCCGCCTCCATCCAGTCTATTAATTGTTGCCGGGAAGCTAGAGT4320              AAGTAGTTCGCCAGTTAATAGTTTGCGCAACGTTGTTGCCATTGCTACAGGCATCGTGGT4380              GTCACGCTCGTCGTTTGGTATGGCTTCATTCAGCTCCGGTTCCCAACGATCAAGGCGAGT4440              TACATGATCCCCCATGTTGTGAAAAAAAGCGGTTAGCTCCTTCGGTCCTCCGATCGTTGT4500              CAGAAGTAAGTTGGCCGCAGTGTTATCACTCATGCTTATGGCAGCACTGCATAATTCTCT4560              TACTGTCATGCCATCCGTAAGATGCTTTTCTGTGACTGGTGAGTACTCAACCAAGTCATT4620              CTGAGAATAGTGTATGCGGCGACCGAGTTGCTCTTGCCCGGCGTCAATACGGGATAATAC4680              CGCGCCACATAGCAGAACTTTAAAAGTGCTCATCATTGGAAAACGTTCTTCGGGGCGAAA4740              ACTCTCAAGGATCTTACCGCTGTTGAGATCCAGTTCGATGTAACCCACTCGTGCACCCAA4800              CTGATCTTCAGCATCTTTTACTTTCACCAGCGTTTCTGGGTGAGCAAAAACAGGAAGGCA4860              AAATGCCGCAAAAAAGGGAATAAGGGCGACACGGAAATGTTGAATACTCATACTCTTCCT4920              TTTTCAATATTATTGAAGCATTTATCAGGGTTATTGTCTCATGAGCGGATACATATTTGA4980              ATGTATTTAGAAAAATAAACAAATAGGGGTTCCGCGCACATTTCCCCGAAAAGTGCCACC5040              TG5042                                                                        (2) INFORMATION FOR SEQ ID NO: 2:                                             (i) SEQUENCE CHARACTERISTICS:                                                 (A) LENGTH: 4792 base pairs                       
                            (B) TYPE: nucleic acid                                                        (C) STRANDEDNESS: double                                                      (D) TOPOLOGY: circular                                                        (ii) MOLECULE TYPE: other nucleic acid                                        (A) DESCRIPTION: /desc = "Plasmid pGS"                                        (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:                                      AAATTGTAAACGTTAATATTTTGTTAAAATTCGCGTTAAATTTTTGTTAAATCAGCTCAT60                TTTTTAACCAATAGGCCGAAATCGGCAAAATCCCTTATAAATCAAAAGAATAGACCGAGA120               TAGGGTTGAGTGTTGTTCCAGTTTGGAACAAGAGTCCACTATTAAAGAACGTGGACTCCA180               ACGTCAAAGGGCGAAAAACCGTCTATCAGGGCGATGGCCCACTACGTGAACCATCACCCT240               AATCAAGTTTTTTGGGGTCGAGGTGCCGTAAAGCACTAAATCGGAACCCTAAAGGGAGCC300               CCCGATTTAGAGCTTGACGGGGAAAGCCGGCGAACGTGGCGAGAAAGGAAGGGAAGAAAG360               CGAAAGGAGCGGGCGCTAGGGCGCTGGCAAGTGTAGCGGTCACGCTGCGCGTAACCACCA420               CACCCGCCGCGCTTAATGCGCCGCTACAGGGCGCGTCCCATTCGCCATTCAGGCTACGCA480               ACTGTTGGGAAGGGCGATCGGTGCGGGCCTCTTCGCTATTACGCCAGCTGGCGAAGGGGG540               GATGTGCTGCAAGGCGATTAAGTTGGGTAACGCCAGGGTTTTCCCAGTCACGACGTTGTA600               AAACGACGGCCAGTGAATTGTAATACGACTCACTATAGGGCGAATTGAGCTCCACCGCGG660               TGGCGGCCGCTCTAGCGATGGCAGCCACCATTCATTTCTCGATGCGACGGTAAACGACGC720               CCGCGGCAGATTAGGTCATTGCCGAACGGATTGAAGCTCTCTCCATCTTGGATCCATTCC780               CGGCCAATCCCGTCTCGGCCAACCACACTGTCCACTCGCCCAGGTCAGCAGCTCAGGACT840               CTCTCCTGGTTTGGTACCGCTTAGTGTAGAGCATACCGCTCTCAGTCCCCATAGACCAAC900               CATAACACCGCACGTTCTCTTTCACTCAAGATGCTTATCATGTCCCCTCTTTCTGCTCCA960               ATGATTCGGACTGGTCGAATACCAATGAGACAAGCGAGAGCGCAGTGCGAGCAAGCGTTC1020              CTGCAGATAGAGCAGTGGGACTGCCGCGCCACAAAGGAAGAGGATCGTGACGTGACGTGA1080              CCAGTGACCAGAAAGCAGAAGATCCAAAAGAGTCAAAAGGACCGAGCCTCACCTACAGTA1140              
ATGGCCCGGATGGCACTCAAGACCGTCCTCTCGGCCCTTTCTCCAACTCTTCTCCTTCCA1200              TAATTCACCTAGGTACATACACGGCCTACGCTTCCGCCTCATCCCATCCCATCCCATCCC1260              ATCCCATCCCATCGACGACTCTAACCCGCCCGCGAGTGCAAACCTCGTCCACGAACGGAC1320              ACCCCGGCTCTCCTCCGAAGCCCTTGCAAGTGGAAGCTGAGGTTGCCGAACTTAGACGAC1380              CAGGTTCACCAGCCGGACCGCAACTCGAACGTCAGAATACAGCCTCAGCCTCCAAAGGGG1440              GTTAACGCCAAGCGAGAGCAAGACAAGATCGTCGCCCATCAATATCCTGGACAAGACAAC1500              ATGGACGCAATATATAACCTCAAGCAAGTCCTCCTCAGCAACCATGATTTCACCACCAGC1560              CTGGTCTCCAACGCAACAGACTTCTCGACAAGTCCCTTGACCTACTTCGCCATGCATCTC1620              GTCTCTTCGCTCCTCGTCGTGGGCGCCGCCTTCCAGGCCGTGCTCGGTCTGCCGGATCCT1680              CTGCATGAAAAGAGGCACAGCGACATCATCAAGCGGTCTGTCGACCGGACGCGCCCCCGG1740              GTTAATTAAGTCTAGAGTGGAGGTAAATCGCTTGCTTCGTACTAGGTAGTAAGTAGTGAT1800              TGGGAAAAGGAAATGAGAGAACGGGAACGGGAACGGGAACGGGAATTTGTGATTACAAAG1860              TGTAAAATTAATAGGCCCGGGATTTTGGTTAGATGCATAAGGGGGGCAGGGGGGGCTAGG1920              AAACGGAAGGTTGCATATCAACCGAGGAAGAATGGGAAGAAAGGGAAGAAAGACAGAAAG1980              AAGGAACAACAGGACTTCATTCTCTCACATCGACATGAGCTACCTGGGCATCAGCTACCT2040              GGGCATCTTGATTTCCTTTTTAGAAGATTGTTTTGTATCCTTTTTTCTTCCTCCCTTTTC2100              TTTTCTTGTCCGTCTCTTACACCTACCTATTTTTAGCCAAAGTCCACACACACACAAACT2160              TTTTGTTAGATATTCTCTGTATCAAAATTGACAAGTTTCAATGTTATACAGTACCTTGCC2220              AAGTTTAATACACATTCAAATCAATCAACCACACACACACAAGTTTTATTGTGCAGAAAT2280              GGAGTGAAGAAGAAACATGTTTGGGATTATGATGACAAGCTTCTCAACAAAATTTCAACG2340              AGTTAAGCTTCAAAGGTCCGCTGGCTCAATGGCAGAGCGTCTGACTACGAATCAGGAGGT2400              TCCAGGTTCGACCCCTGGGTGGATCGAGTTGCAAATTGGTACTTTGAGTACCAAAGTTCC2460              TTTTTTTTTTTCGTTTGGCTCTCTGCTTTTCGACAGTTCACTGAGTCATGTGCAAGACAC2520              CCCTGATCGGGTACGTACTGAACTGCTTTTGGTGCAGTGCAATGGTTCTCGAGGGGGGGC2580              CCGGTACCCAGCTTTTGTTCCCTTTAGTGAGGGTTAATTCCGAGCTTGGCGTAATCATGG2640              
TCATAGCTGTTTCCTGTGTGAAATTGTTATCCGCTCACAATTCCACACAACATAGGAGCC2700              GGAAGCATAAAGTGTAAAGCCTGGGGTGCCTAATGAGTGAGGTAACTCACATTAATTGCG2760              TTGCGCTCACTGCCCGCTTTCCAGTCGGGAAACCTGTCGTGCCAGCTGCATTAATGAATC2820              GGCCAACGCGCGGGGAGAGGCGGTTTGCGTATTGGGCGCTCTTCCGCTTCCTCGCTCACT2880              GACTCGCTGCGCTCGGTCGTTCGGCTGCGGCGAGCGGTATCAGCTCACTCAAAGGCGGTA2940              ATACGGTTATCCACAGAATCAGGGGATAACGCAGGAAAGAACATGTGAGCAAAAGGCCAG3000              CAAAAGGCCAGGAACCGTAAAAAGGCCGCGTTGCTGGCGTTTTTCCATAGGCTCGGCCCC3060              CCTGACGAGCATCACAAAAATCGACGCTCAAGTCAGAGGTGGCGAAACCCGACAGGACTA3120              TAAAGATACCAGGCGTTCCCCCCTGGAAGCTCCCTCGTGCGCTCTCCTGTTCCGACCCTG3180              CCGCTTACCGGATACCTGTCCGCCTTTCTCCCTTCGGGAAGCGTGGCGCTTTCTCAATGC3240              TCACGCTGTAGGTATCTCAGTTCGGTGTAGGTCGTTCGCTCCAAGCTGGGCTGTGTGCAC3300              GAACCCCCCGTTCAGCCCGACCGCTGCGCCTTATCCGGTAACTATCGTCTTGAGTCCAAC3360              CCGGTAAGACACGACTTATCGCCACTGGCAGCAGCCACTGGTAACAGGATTAGCAGAGCG3420              AGGTATGTAGGCGGTGCTACAGAGTTCTTGAAGTGGTGGCCTAACTACGGCTACACTAGA3480              AGGACAGTATTTGGTATCTGCGCTCTGCTGAAGCCAGTTACCTTCGGAAAAAGAGTTGGT3540              AGCTCTTGATCCGGCAAACAAACCACCGCTGGTAGCGGTGGTTTTTTTGTTTGCAAGCAG3600              CAGATTACGCGCAGAAAAAAAGGATCTCAAGAAGATCCTTTGATCTTTTCTACGGGGTCT3660              GACGCTCAGTGGAACGAAAACTCACGTTAAGGGATTTTGGTCATGAGATTATCAAAAAGG3720              ATCTTCACCTAGATCCTTTTAAATTAAAAATGAAGTTTTAAATCAATCTAAAGTATATAT3780              GAGTAAACTTGGTCTGACAGTTACCAATGCTTAATCAGTGAGGCACCTATCTCAGCGATC3840              TGTCTATTTCGTTCATCCATAGTTGCCTGACTGCCCGTCGTGTAGATAACTACGATACGG3900              GAGGGCTTACCATCTGGCCCCAGTGCTGCAATGATACCGCGAGACCCACGCTCACCGGCT3960              CCAGATTTATCAGCAATAAACCAGCCAGCCGGAAGGGCCGAGCGCAGAAGTGGTCCTGCA4020              ACTTTATCCGCCTCCATCCAGTCTATTAATTGTTGCCGGGAAGCTAGAGTAAGTAGTTCG4080              CCAGTTAATAGTTTGCGCAACGTTGTTGCCATTGCTACAGGCATCGTGGTGTCACGCTCG4140              
TCGTTTGGTATGGCTTCATTCAGCTCCGGTTCCCAACGATCAAGGCGAGTTACATGATCC4200              CCCATGTTGTGAAAAAAAGCGGTTAGCTCCTTCGGTCCTCCGATCGTTGTCAGAAGTAAG4260              TTGGCCGCAGTGTTATCACTCATGCTTATGGCAGCACTGCATAATTCTCTTACTGTCATG4320              CCATCCGTAAGATGCTTTTCTGTGACTGGTGAGTACTCAACCAAGTCATTCTGAGAATAG4380              TGTATGCGGCGACCGAGTTGCTCTTGCCCGGCGTCAATACGGGATAATACCGCGCCACAT4440              AGCAGAACTTTAAAAGTGCTCATCATTGGAAAACGTTCTTCGGGGCGAAAACTCTCAAGG4500              ATCTTACCGCTGTTGAGATCCAGTTCGATGTAACCCACTCGTGCACCCAACTGATCTTCA4560              GCATCTTTTACTTTCACCAGCGTTTCTGGGTGAGCAAAAACAGGAAGGCAAAATGCCGCA4620              AAAAAGGGAATAAGGGCGACACGGAAATGTTGAATACTCATACTCTTCCTTTTTCAATAT4680              TATTGAAGCATTTATCAGGGTTATTGTCTCATGAGCGGATACATATTTGAATGTATTTAG4740              AAAAATAAACAAATAGGGGTTCCGCGCACATTTCCCCGAAAAGTGCCACCTG4792                      (2) INFORMATION FOR SEQ ID NO: 3:                                             (i) SEQUENCE CHARACTERISTICS:                                                 (A) LENGTH: 3796 base pairs                                                   (B) TYPE: nucleic acid                                                        (C) STRANDEDNESS: double                                                      (D) TOPOLOGY: circular                                                        (ii) MOLECULE TYPE: other nucleic acid                                        (A) DESCRIPTION: /desc = "Plasmid pgla Xho-"                                  (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:                                      AAAAGCTGGAATTCGAGCTCCACCGCGGTGGCGGCCGCTCTAGAACTAGTGGATCCCCCG60                GGCTGCAGGAATTCGATATCAAGCTTATCGATGGCAGCCACCATTCATTTCTCGATGCGA120               CGGTAAACGACGCCCGCGGCAGATTAGGTCATTGCCGAACGGATTGAAGCTCTCTCCATC180               TTGGATCCATTCCCGGCCAATCCCGTCTCGGCCAACCACACTGTCCACTCGCCCAGGTCA240               GCAGCTCAGGACTCTCTCCTGGTTTGGTACCGCTTAGTGTAGAGCATACCGCTCTCAGTC300               
CCCATAGACCAACCATAACACCGCACGTTCTCTTTCACTCAAGATGCTTATCATGTCCCC360               TCTTTCTGCTCCAATGATTCGGACTGGTCGAATACCAATGAGACAAGCGAGAGCGCAGTG420               CGAGCAAGCGTTCCTGCAGATAGAGCAGTGGGACTGCCGCGCCACAAAGGAAGAGGATCG480               TGACGTGACGTGACCAGTGACCAGAAAGCAGAAGATCCAAAAGAGTCAAAAGGACCGAGC540               CTCACCTACAGTAATGGCCCGGATGGCACTCAAGACCGTCCTCTCGGCCCTTTCTCCAAC600               TCTTCTCCTTCCATAATTCACCTAGGTACATACACGGCCTACGCTTCCGCCTCATCCCAT660               CCCATCCCATCCCATCCCATCCCATCGACGACTCTAACCCGCCCGCGAGTGCAAACCTCG720               TCCACGAACGGACACCCCGGCTCTCCTCCGAAGCCCTTGCAAGTGGAAGCTGAGGTTGCC780               GAACTTAGACGACCAGGTTCACCAGCCGGACCGCAACTCGAACGTCAGAATACAGCCTCA840               GCCTCCAAAGGGGGTTAACGCCAAGCGAGAGCAAGACAAGATCGTCGCCCATCAATATCC900               TGGACAAGACAACATGGACGCAATATATAACCTCAAGCAAGTCCTCCTCAGCAACCATGA960               TTTCACCACCAGCCTGGTCTCCAACGCAACAGACTTCTCGACAAGTCCCTTGACCTACTT1020              CGCCATGCATCTCGTCTCTTCGCTCCTCGTCGTGGGCGCCGCCTTCCAGGCCGTGCTCGG1080              TCTGCCGGATCCTCTGCATGAAAAGAGGCACAGCGACATCATCAAGCGGTCTGTCGACTC1140              GTATATCCAGACCGAGACTCCCATTGCGCAGAAGAACCTTCTGTGCAACATCGGTGCTTC1200              TGGATGCAGAGCCTCCGGTGCTGCCTCTGGTGTTGTGGTTGCCTCCCCTTCCAAGTCGAG1260              CCCTGACTGTAAGTGGAAATTGCACAGTGTGTCTCATCTCTCATGGCAGCATAGCTCACA1320              GTGTCGATAGACTGGTATACCTGGACTCGTGATGCCGCCCTTGTCACCAAGCTTATTGTC1380              GACGAATTCACCAACGACTACAACACCACTCTTCAGAACACCATTCAGGCTTATGCTGCT1440              GCACAGGCCAAGCTTCAGGGCGTTAGCAACCCGTCCGGTTCCCTCTCCAACGGGGCCGGT1500              CTTGGTGAGCCCAAGTTCATGGTCGACCTCCAGCAGTTCACCGGTGCCTGGGGCCGCCCC1560              CAGAGGGATGGCCCTCCCCTTCGCGCCATTGCCCTGATCGGCTATGGCAAGTGGCTCGTC1620              AGCAACGGTTATGCTGATACGGCCAAGAGCATCATCTGGCCCATTGTGAAGAACGACCTT1680              GCCTACACTGCCCAGTACTGGAACAACACTGGCTTCGATCTCTGGGAGGAGGTTAACAGC1740              TCTTCTTTCTTCACCATCGCCGCCTCCCACCGTGCTCTCGTTGAGGGTTCTGCTTTTGCC1800              
AAGTCCGTCGGCAGCTCTTGCAGCGCTTGCGATGCCATTGCCCCCCAAATTCTGTGCTTC1860              CAGCAGAGCTTCTGGTCCAACAGCGGCTACATCATCTCCAACTTTGTCAACTACCGCAGC1920              GGCAAGGACATCAACTCCGTCTTGACTTCCATCCACAACTTCGACCCCGCTGCCGGTTGC1980              GATGTCAACACCTTCCAGCCCTGCAGCGACCGGGCTCTTGCCAACCACAAGGTTGTCGTT2040              GACTCCATGCGCTTCTGGGGTGTCAACTCCGGTCGCACTGCCGGTAAGGCCGCCGCTGTC2100              GGTCGCTACGCTGAGGATGTCTACTACAACGGTAACCCGTGGTACCTCGCTACTCTCGCC2160              GCCGCCGAGCAGCTCTACGACGCCGTCTACGTCTGGAAGAAGCAGGGTTCTATCACTGTC2220              ACCTCCACCTCCCTCGCCTTCTTCAAGGACCTCGTTCCCTCCGTCAGCACCGGCACCTAC2280              TCCAGCTCTTCCTCCACCTACACCGCCATCATCAACGCCGTCACCACCTATGCCGACGGC2340              TTCGTCGACATCGTTGCCCAGTACACTCCCTCCGACGGCTCCCTGGCCGAGCAGTTCGAC2400              AAGGATTCGGGCGCCCCCCTCAGCGCCACCCACCTGACCTGGTCGTACGCCTCCTTCCTT2460              TCCGCCGCCGCCCGCCGCGCCGGCATCGTCCCTCCCTCGTGGGGCGCCGCGTCCGCCAAC2520              TCTCTGCCCGGTTCCTGCTCCGCCTCCACCGTCGCCGGTTCATACGCCACCGCGACTGCC2580              ACCTCCTTTCCCGCCAACCTCACGCCCGCCAGCACCACCGTCACCCCTCCCACGCAGACC2640              GGCTGCGCCGCCGACCACGAGGTTTTGGTAACTTTCAACGAAAAGGTCACCACCAGCTAT2700              GGTCAGACGGTCAAGGTCGTCGGCAGCATCGCTCGGCTCGGCAACTGGGCCCCCGCCAGC2760              GGGCTCACCCTGTCGGCCAAACAGTACTCTTCCAGCAACCCGCTCTGGTCCACCACTATT2820              GCGCTGCCCCAGGGCACCTCGTTCAAGTACAAGTATGTCGTCGTCAACTCGGATGGGTCC2880              GTCAAGTGGGAGAACGATCCTGACCGCAGCTATGCTGTTGGGACGGACTGCGCCTCTACT2940              GCGACTCTTGATGATACGTGGAGGTAAATCGCTTGCTTCGTACTAGGTAGTAAGTAGTGA3000              TTGGGAAAAGGAAATGAGAGAACGGGAACGGGAACGGGAACGGGAATTTGTGATTACAAA3060              GTGTAAAATTAATAGGCCCGGGATTTTGGTTAGATGCATAAGGGGGGCAGGGGGGGCTAG3120              GAAACGGAAGGTTGCATATCAACCGAGGAAGAATGGGAAGAAAGGGAAGAAAGACAGAAA3180              GAAGGAACAACAGGACTTCATTCTCTCACATCGACATGAGCTACCTGGGCATCAGCTACC3240              TGGGCATCTTGATTTCCTTTTTAGAAGATTGTTTTGTATCCTTTTTTCTTCCTCCCTTTT3300              
CTTTTCTTGTCCGTCTCTTACACCTACCTATTTTTAGCCAAAGTCCACACACACACAAAC3360              TTTTTGTTAGATATTCTCTGTATCAAAATTGACAAGTTTCAATGTTATACAGTACCTTGC3420              CAAGTTTAATACACATTCAAATCAATCAACCACACACACACAAGTTTTATTGTGCAGAAA3480              TGGAGTGAAGAAGAAACATGTTTGGGATTATGATGACAAGCTTCTCAACAAAATTTCAAC3540              GAGTTAAGCTTCAAAGGTCCGCTGGCTCAATGGCAGAGCGTCTGACTACGAATCAGGAGG3600              TTCCAGGTTCGACCCCTGGGTGGATCGAGTTGCAAATTGGTACTTTGAGTACCAAAGTTC3660              CTTTTTTTTTTTCGTTTGGCTCTCTGCTTTTCGACAGTTCACTGAGTCATGTGCAAGACA3720              CCCCTGATCGGGTACGTACTGAACTGCTTTTGGTGCAGTGCAATGGTTCTCGAGGGGGGG3780              CCCGGTACCCAATTCG3796                                                          (2) INFORMATION FOR SEQ ID NO: 4:                                             (i) SEQUENCE CHARACTERISTICS:                                                 (A) LENGTH: 1881 base pairs                                                   (B) TYPE: nucleic acid                                                        (C) STRANDEDNESS: single                                                      (D) TOPOLOGY: linear                                                          (ii) MOLECULE TYPE: cDNA                                                      (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:                                      ATGCATCTCGTCTCTTCGCTCCTCGTCGTGGGCGCCGCCTTCCAGGCCGTGCTCGGTCTG60                CCGGATCCTCTGCATGAAAAGAGGCACAGCGACATCATCAAGCGGTCTGTCGACTCGTAT120               ATCCAGACCGAGACTCCCATTGCGCAGAAGAACCTTCTGTGCAACATCGGTGCTTCTGGA180               TGCAGAGCCTCCGGTGCTGCCTCTGGTGTTGTGGTTGCCTCCCCTTCCAAGTCGAGCCCT240               GACTACTGGTATACCTGGACTCGTGATGCCGCCCTTGTCACCAAGCTTATTGTCGACGAA300               TTCACCAACGACTACAACACCACTCTTCAGAACACCATTCAGGCTTATGCTGCTGCACAG360               GCCAAGCTTCAGGGCGTTAGCAACCCGTCCGGTTCCCTCTCCAACGGGGCCGGTCTTGGT420               GAGCCCAAGTTCATGGTCGACCTCCAGCAGTTCACCGGTGCCTGGGGCCGCCCCCAGAGG480               
GATGGCCCTCCCCTTCGCGCCATTGCCCTGATCGGCTATGGCAAGTGGCTCGTCAGCAAC540               GGTTATGCTGATACGGCCAAGAGCATCATCTGGCCCATTGTGAAGAACGACCTTGCCTAC600               ACTGCCCAGTACTGGAACAACACTGGCTTCGATCTCTGGGAGGAGGTTAACAGCTCTTCT660               TTCTTCACCATCGCCGCCTCCCACCGTGCTCTCGTTGAGGGTTCTGCTTTTGCCAAGTCC720               GTCGGCAGCTCTTGCAGCGCTTGCGATGCCATTGCCCCCCAAATTCTGTGCTTCCAGCAG780               AGCTTCTGGTCCAACAGCGGCTACATCATCTCCAACTTTGTCAACTACCGCAGCGGCAAG840               GACATCAACTCCGTCTTGACTTCCATCCACAACTTCGACCCCGCTGCCGGTTGCGATGTC900               AACACCTTCCAGCCCTGCAGCGACCGGGCTCTTGCCAACCACAAGGTTGTCGTTGACTCC960               ATGCGCTTCTGGGGTGTCAACTCCGGTCGCACTGCCGGTAAGGCCGCCGCTGTCGGTCGC1020              TACGCTGAGGATGTCTACTACAACGGTAACCCGTGGTACCTCGCTACTCTCGCCGCCGCC1080              GAGCAGCTCTACGACGCCGTCTACGTCTGGAAGAAGCAGGGTTCTATCACTGTCACCTCC1140              ACCTCCCTCGCCTTCTTCAAGGACCTCGTTCCCTCCGTCAGCACCGGCACCTACTCCAGC1200              TCTTCCTCCACCTACACCGCCATCATCAACGCCGTCACCACCTATGCCGACGGCTTCGTC1260              GACATCGTTGCCCAGTACACTCCCTCCGACGGCTCCCTGGCCGAGCAGTTCGACAAGGAT1320              TCGGGCGCCCCCCTCAGCGCCACCCACCTGACCTGGTCGTACGCCTCCTTCCTTTCCGCC1380              GCCGCCCGCCGCGCCGGCATCGTCCCTCCCTCGTGGGGCGCCGCGTCCGCCAACTCTCTG1440              CCCGGTTCCTGCTCCGCCTCCACCGTCGCCGGTTCATACGCCACCGCGACTGCCACCTCC1500              TTTCCCGCCAACCTCACGCCCGCCAGCACCACCGTCACCCCTCCCACGCAGACCGGCTGC1560              GCCGCCGACCACGAGGTTTTGGTAACTTTCAACGAAAAGGTCACCACCAGCTATGGTCAG1620              ACGGTCAAGGTCGTCGGCAGCATCGCTCGGCTCGGCAACTGGGCCCCCGCCAGCGGGCTC1680              ACCCTGTCGGCCAAACAGTACTCTTCCAGCAACCCGCTCTGGTCCACCACTATTGCGCTG1740              CCCCAGGGCACCTCGTTCAAGTACAAGTATGTCGTCGTCAACTCGGATGGGTCCGTCAAG1800              TGGGAGAACGATCCTGACCGCAGCTATGCTGTTGGGACGGACTGCGCCTCTACTGCGACT1860              CTTGATGATACGTGGAGGTAA1881                                                     (2) INFORMATION FOR SEQ ID NO: 5:                                             (i) SEQUENCE CHARACTERISTICS:                     
                            (A) LENGTH: 3041 base pairs                                                   (B) TYPE: nucleic acid                                                        (C) STRANDEDNESS: double                                                      (D) TOPOLOGY: circular                                                        (ii) MOLECULE TYPE: other nucleic acid                                        (A) DESCRIPTION: /desc = "Plasmid pgla-mro"                                   (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:                                      AAAAGCTGGAATTCGAGCTCCACCGCGGTGGCGGCCGCTCTAGAACTAGTGGATCCCCCG60                GGCTGCAGGAATTCGATATCAAGCTTATCGATGGCAGCCACCATTCATTTCTCGATGCGA120               CGGTAAACGACGCCCGCGGCAGATTAGGTCATTGCCGAACGGATTGAAGCTCTCTCCATC180               TTGGATCCATTCCCGGCCAATCCCGTCTCGGCCAACCACACTGTCCACTCGCCCAGGTCA240               GCAGCTCAGGACTCTCTCCTGGTTTGGTACCGCTTAGTGTAGAGCATACCGCTCTCAGTC300               CCCATAGACCAACCATAACACCGCACGTTCTCTTTCACTCAAGATGCTTATCATGTCCCC360               TCTTTCTGCTCCAATGATTCGGACTGGTCGAATACCAATGAGACAAGCGAGAGCGCAGTG420               CGAGCAAGCGTTCCTGCAGATAGAGCAGTGGGACTGCCGCGCCACAAAGGAAGAGGATCG480               TGACGTGACGTGACCAGTGACCAGAAAGCAGAAGATCCAAAAGAGTCAAAAGGACCGAGC540               CTCACCTACAGTAATGGCCCGGATGGCACTCAAGACCGTCCTCTCGGCCCTTTCTCCAAC600               TCTTCTCCTTCCATAATTCACCTAGGTACATACACGGCCTACGCTTCCGCCTCATCCCAT660               CCCATCCCATCCCATCCCATCCCATCGACGACTCTAACCCGCCCGCGAGTGCAAACCTCG720               TCCACGAACGGACACCCCGGCTCTCCTCCGAAGCCCTTGCAAGTGGAAGCTGAGGTTGCC780               GAACTTAGACGACCAGGTTCACCAGCCGGACCGCAACTCGAACGTCAGAATACAGCCTCA840               GCCTCCAAAGGGGGTTAACGCCAAGCGAGAGCAAGACAAGATCGTCGCCCATCAATATCC900               TGGACAAGACAACATGGACGCAATATATAACCTCAAGCAAGTCCTCCTCAGCAACCATGA960               TTTCACCACCAGCCTGGTCTCCAACGCAACAGACTTCTCGACAAGTCCCTTGACCTACTT1020              CGCCATGCATCTCGTCTCTTCGCTCCTCGTCGTGGGCGCCGCCTTCCAGGCCGTGCTCGG1080              
TCTGCCGGATCCTCTGCATGAAAAGAGGCACAGCGACATCATCAAGCGGTCTGTCGACTC1140              GTATATCCAGACCGAGACTCCCATTGCGCAGAAGAACCTTCTGTGCAACATCGGTGCTTC1200              TGGATGCAGAGCCTCCGGTGCTGCCTCTGGTGTTGTGGTTGCCTCCCCTTCCAAGTCGAG1260              CCCTGACTGTAAGTGGAAATTGCACAGTGTGTCTCATCTCTCATGGCAGCATAGCTCACA1320              GTGTCGATAGACTGGTATACCTGGACTCGTGATGCCGCCCTTGTCACCAAGCTTATTGTC1380              GACGAATTCACCAACGACTACAACACCACTCTTCAGAACACCATTCAGGCTTATGCTGCT1440              GCACAGGCCAAGCTTCAGGGCGTTAGCAACCCGTCCGGTTCCCTCTCCAACGGGGCCGGT1500              CTTGGTGAGCCCAAGTTCATGGTCGACCTCCAGCAGTTCACCGGTGCCTGGGGCCGCCCC1560              CAGAGGGATGGCCCTCCCCTTCGCGCCATTGCCCTGATCGGCTATGGCAAGTGGCTCGTC1620              AGCAACGGTTATGCTGATACGGCCAAGAGCATCATCTGGCCCATTGTGAAGAACGACCTT1680              GCCTACACTGCCCAGTACTGGAACAACACTGGCTTCGATCTCTGGGAGGAGGTTAACAGC1740              TCTTCTTTCTTCACCATCGCCGCCTCCCACCGTGCTCTCGTTGAGGGTTCTGCTTTTGCC1800              AAGTCCGTCGGCAGCTCTTGCAGCGCTTGCGATGCCATTGCCCCCCAAATTCTGTGCTTC1860              CAGCAGAGCTTCTGGTCCAACAGCGGCTACATCATCTCCAACTTTGTCAACTACCGCAGC1920              GGCAAGGACATCAACTCCGTCTTGACTTCCATCCACAACTTCGACCCCGCTGCCGGTTGC1980              GATGTCAACACCTTCCAGCCCTGCAGCGACCGGGCTCTTGCCAACCACAAGGTTGTCGTT2040              GACTCCATGCGCTTCTGGGGTGTCAACTCCGGTCGCACTGCCGGTAAGGCCGCCGCTGTC2100              GGTCGCTACGCTGAGGATGTCTACTACAACGGTAACCCGTGGTACCTCGCTACTCTCGCC2160              GCCGCCGAGCAGCTCTACGACGCCGTCTACGTCTGGAAGAAGCAGGGTTCTATCACTGTC2220              ACCTCCACCTCCCTCGCCTTCTTCAAGGACCTCGTTCCCTCCGTCAGCACCGGCACCTAC2280              TCCAGCTCTTCCTCCACCTACACCGCCATCATCAACGCCGTCACCACCTATGCCGACGGC2340              TTCGTCGACATCGTTGCCCAGTACACTCCCTCCGACGGCTCCCTGGCCGAGCAGTTCGAC2400              AAGGATTCGGGCGCCCCCCTCAGCGCCACCCACCTGACCTGGTCGTACGCCTCCTTCCTT2460              TCCGCCGCCGCCCGCCGCGCCGGCATCGTCCCTCCCTCGTGGGGCGCCGCGTCCGCCAAC2520              TCTCTGCCCGGTTCCTGCTCCGCCTCCACCGTCGCCGGTTCATACGCCACCGCGACTGCC2580              
ACCTCCTTTCCCGCCAACCTCACGCCCGCCAGCACCACCGTCACCCCTCCCACGCAGACC2640              GGCTGCGCCGCCGACCACGAGGTTTTGGTAACTTTCAACGAAAAGGTCACCACCAGCTAT2700              GGTCAGACGGTCAAGGTCGTCGGCAGCATCGCTCGGCTCGGCAACTGGGCCCCCGCCAGC2760              GGGCTCACCCTGTCGGCCAAACAGTACTCTTCCAGCAACCCGCTCTGGTCCACCACTATT2820              GCGCTGCCCCAGGGCACCTCGTTCAAGTACAAGTATGTCGTCGTCAACTCGGATGGGTCC2880              GTCAAGTGGGAGAACGATCCTGACCGCAGCTATGCTGTTGGGACGGACTGCGCCTCTACT2940              GCGACTCTTGATGATATCCGGAGGAGGTAAATCGCCGGGGGCGCGCCGGATCCTTAATTA3000              AGTCTAGAGTCGACTGTTTAAACCTGCAGGCATGCAAGCTT3041                                 (2) INFORMATION FOR SEQ ID NO: 6:                                             (i) SEQUENCE CHARACTERISTICS:                                                 (A) LENGTH: 3718 base pairs                                                   (B) TYPE: nucleic acid                                                        (C) STRANDEDNESS: double                                                      (D) TOPOLOGY: linear                                                          (ii) MOLECULE TYPE: DNA (genomic)                                             (vi) ORIGINAL SOURCE:                                                         (A) ORGANISM: Neurospora crassa                                               (B) STRAIN: Oak Ridge (=St Lawrence) 74A                                      (vii) IMMEDIATE SOURCE:                                                       (A) LIBRARY: lambda J1                                                        (B) CLONE: lambda J1 glam                                                     (x) PUBLICATION INFORMATION:                                                  (A) AUTHORS: Stone, P J                                                       Makoff, A J                                                                   Parish, J H                                                                   Radford, A                                        
                            (B) TITLE: Cloning and sequence analysis of the                               glucoamylase gene of Neurospora crassa                                        (C) JOURNAL: Curr. Genet.                                                     (D) VOLUME: 24                                                                (F) PAGES: 205-211                                                            (G) DATE: 1993                                                                (K) RELEVANT RESIDUES IN SEQ ID NO: 6: FROM 1 TO 3718                         (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:                                      ATCGATGGCAGCCACCATTCATTTCTCGATGCGACGGTAAACGACGCCCGCGGCAGATTA60                GGTCATTGCCGAACGGATTGAAGCTCTCTCCATCTTGGATCCATTCCCGGCCAATCCCGT120               CTCGGCCAACCACACTGTCCACTCGCCCAGGTCAGCAGCTCAGGACTCTCTCCTGGTTTG180               GTACCGCTTAGTGTAGAGCATACCGCTCTCAGTCCCCATAGACCAACCATAACACCGCAC240               GTTCTCTTTCACTCAAGATGCTTATCATGTCCCCTCTTTCTGCTCCAATGATTCGGACTG300               GTCGAATACCAATGAGACAAGCGAGAGCGCAGTGCGAGCAAGCGTTCCTGCAGATAGAGC360               AGTGGGACTGCCGCGCCACAAAGGAAGAGGATCGTGACGTGACGTGACCAGTGACCAGAA420               AGCAGAAGATCCAAAAGAGTCAAAAGGACCGAGCCTCACCTACAGTAATGGCCCGGATGG480               CACTCAAGACCGTCCTCTCGGCCCTTTCTCCAACTCTTCTCCTTCCATAATTCACCTAGG540               TACATACACGGCCTACGCTTCCGCCTCATCCCATCCCATCCCATCCCATCCCATCCCATC600               GACGACTCTAACCCGCCCGCGAGTGCAAACCTCGTCCACGAACGGACACCCCGGCTCTCC660               TCCGAAGCCCTTGCAAGTGGAAGCTGAGGTTGCCGAACTTAGACGACCAGGTTCACCAGC720               CGGACCGCAACTCGAACGTCAGAATACAGCCTCAGCCTCCAAAGGGGGTTAACGCCAAGC780               GAGAGCAAGACAAGATCGTCGCCCATCAATATCCTGGACAAGACAACATGGACGCAATAT840               ATAACCTCAAGCAAGTCCTCCTCAGCAACCATGATTTCACCACCAGCCTGGTCTCCAACG900               CAACAGACTTCTCGACAAGTCCCTTGACCTACTTCGCCATGCATCTCGTCTCTTCGCTCC960               TCGTCGTGGGCGCCGCCTTCCAGGCCGTGCTCGGTCTGCCGGATCCTCTGCATGAAAAGA1020              
GGCACAGCGACATCATCAAGCGGTCTGTCGACTCGTATATCCAGACCGAGACTCCCATTG1080              CGCAGAAGAACCTTCTGTGCAACATCGGTGCTTCTGGATGCAGAGCCTCCGGTGCTGCCT1140              CTGGTGTTGTGGTTGCCTCCCCTTCCAAGTCGAGCCCTGACTGTAAGTGGAAATTGCACA1200              GTGTGTCTCATCTCTCATGGCAGCATAGCTCACAGTGTCGATAGACTGGTATACCTGGAC1260              TCGTGATGCCGCCCTTGTCACCAAGCTTATTGTCGACGAATTCACCAACGACTACAACAC1320              CACTCTTCAGAACACCATTCAGGCTTATGCTGCTGCACAGGCCAAGCTTCAGGGCGTTAG1380              CAACCCGTCCGGTTCCCTCTCCAACGGGGCCGGTCTTGGTGAGCCCAAGTTCATGGTCGA1440              CCTCCAGCAGTTCACCGGTGCCTGGGGCCGCCCCCAGAGGGATGGCCCTCCCCTTCGCGC1500              CATTGCCCTGATCGGCTATGGCAAGTGGCTCGTCAGCAACGGTTATGCTGATACGGCCAA1560              GAGCATCATCTGGCCCATTGTGAAGAACGACCTTGCCTACACTGCCCAGTACTGGAACAA1620              CACTGGCTTCGATCTCTGGGAGGAGGTTAACAGCTCTTCTTTCTTCACCATCGCCGCCTC1680              CCACCGTGCTCTCGTTGAGGGTTCTGCTTTTGCCAAGTCCGTCGGCAGCTCTTGCAGCGC1740              TTGCGATGCCATTGCCCCCCAAATTCTGTGCTTCCAGCAGAGCTTCTGGTCCAACAGCGG1800              CTACATCATCTCCAACTTTGTCAACTACCGCAGCGGCAAGGACATCAACTCCGTCTTGAC1860              TTCCATCCACAACTTCGACCCCGCTGCCGGTTGCGATGTCAACACCTTCCAGCCCTGCAG1920              CGACCGGGCTCTTGCCAACCACAAGGTTGTCGTTGACTCCATGCGCTTCTGGGGTGTCAA1980              CTCCGGTCGCACTGCCGGTAAGGCCGCCGCTGTCGGTCGCTACGCTGAGGATGTCTACTA2040              CAACGGTAACCCGTGGTACCTCGCTACTCTCGCCGCCGCCGAGCAGCTCTACGACGCCGT2100              CTACGTCTGGAAGAAGCAGGGTTCTATCACTGTCACCTCCACCTCCCTCGCCTTCTTCAA2160              GGACCTCGTTCCCTCCGTCAGCACCGGCACCTACTCCAGCTCTTCCTCCACCTACACCGC2220              CATCATCAACGCCGTCACCACCTATGCCGACGGCTTCGTCGACATCGTTGCCCAGTACAC2280              TCCCTCCGACGGCTCCCTGGCCGAGCAGTTCGACAAGGATTCGGGCGCCCCCCTCAGCGC2340              CACCCACCTGACCTGGTCGTACGCCTCCTTCCTTTCCGCCGCCGCCCGCCGCGCCGGCAT2400              CGTCCCTCCCTCGTGGGGCGCCGCGTCCGCCAACTCTCTGCCCGGTTCCTGCTCCGCCTC2460              CACCGTCGCCGGTTCATACGCCACCGCGACTGCCACCTCCTTTCCCGCCAACCTCACGCC2520              
CGCCAGCACCACCGTCACCCCTCCCACGCAGACCGGCTGCGCCGCCGACCACGAGGTTTT2580              GGTAACTTTCAACGAAAAGGTCACCACCAGCTATGGTCAGACGGTCAAGGTCGTCGGCAG2640              CATCGCTCGGCTCGGCAACTGGGCCCCCGCCAGCGGGCTCACCCTGTCGGCCAAACAGTA2700              CTCTTCCAGCAACCCGCTCTGGTCCACCACTATTGCGCTGCCCCAGGGCACCTCGTTCAA2760              GTACAAGTATGTCGTCGTCAACTCGGATGGGTCCGTCAAGTGGGAGAACGATCCTGACCG2820              CAGCTATGCTGTTGGGACGGACTGCGCCTCTACTGCGACTCTTGATGATACGTGGAGGTA2880              AATCGCTTGCTTCGTACTAGGTAGTAAGTAGTGATTGGGAAAAGGAAATGAGAGAACGGG2940              AACGGGAACGGGAACGGGAATTTGTGATTACAAAGTGTAAAATTAATAGGCCCGGGATTT3000              TGGTTAGATGCATAAGGGGGGCAGGGGGGGCTAGGAAACGGAAGGTTGCATATCAACCGA3060              GGAAGAATGGGAAGAAAGGGAAGAAAGACAGAAAGAAGGAACAACAGGACTTCATTCTCT3120              CACATCGACATGAGCTACCTGGGCATCAGCTACCTGGGCATCTTGATTTCCTTTTTAGAA3180              GATTGTTTTGTATCCTTTTTTCTTCCTCCCTTTTCTTTTCTTGTCCGTCTCTTACACCTA3240              CCTATTTTTAGCCAAAGTCCACACACACACAAACTTTTTGTTAGATATTCTCTGTATCAA3300              AATTGACAAGTTTCAATGTTATACAGTACCTTGCCAAGTTTAATACACATTCAAATCAAT3360              CAACCACACACACACAAGTTTTATTGTGCAGAAATGGAGTGAAGAAGAAACATGTTTGGG3420              ATTATGATGACAAGCTTCTCAACAAAATTTCAACGAGTTAAGCTTCAAAGGTCCGCTGGC3480              TCAATGGCAGAGCGTCTGACTACGAATCAGGAGGTTCCAGGTTCGACCCCTGGGTGGATC3540              GAGTTGCAAATTGGTACTTTGAGTACCAAAGTTCCTTTTTTTTTTTCGTTTGGCTCTCTG3600              CTTTTCGACAGTTCACTGAGTCATGTGCAAGACACCCCTGATCGGGTACGTACTGAACTG3660              CTTTTGGTGCAGTGCAATGGTTCTCGAGTGCAAGGGATGAAAGGAAGATATGTCTTGG3718                (2) INFORMATION FOR SEQ ID NO:7:                                              (i) SEQUENCE CHARACTERISTICS:                                                 (A) LENGTH: 626 amino acids                                                   (B) TYPE: amino acid                                                          (C) STRANDEDNESS: single                                                      (D) TOPOLOGY: linear                              
  (ii) MOLECULE TYPE: protein
  (iii) HYPOTHETICAL: NO
  (v) FRAGMENT TYPE:
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:
MetHisLeuValSerSerLeuLeuValValGlyAlaAlaPheGlnAla
1 5 10 15
ValLeuGlyLeuProAspProLeuHisGluLysArgHisSerAspIle
20 25 30
IleLysArgSerValAspSerTyrIleGlnThrGluThrProIleAla
35 40 45
GlnLysAsnLeuLeuCysAsnIleGlyAlaSerGlyCysArgAlaSer
50 55 60
GlyAlaAlaSerGlyValValValAlaSerProSerLysSerSerPro
65 70 75 80
AspTyrTrpTyrThrTrpThrArgAspAlaAlaLeuValThrLysLeu
85 90 95
IleValAspGluPheThrAsnAspTyrAsnThrThrLeuGlnAsnThr
100 105 110
IleGlnAlaTyrAlaAlaAlaGlnAlaLysLeuGlnGlyValSerAsn
115 120 125
ProSerGlySerLeuSerAsnGlyAlaGlyLeuGlyGluProLysPhe
130 135 140
MetValAspLeuGlnGlnPheThrGlyAlaTrpGlyArgProGlnArg
145 150 155 160
AspGlyProProLeuArgAlaIleAlaLeuIleGlyTyrGlyLysTrp
165 170 175
LeuValSerAsnGlyTyrAlaAspThrAlaLysSerIleIleTrpPro
180 185 190
IleValLysAsnAspLeuAlaTyrThrAlaGlnTyrTrpAsnAsnThr
195 200 205
GlyPheAspLeuTrpGluGluValAsnSerSerSerPhePheThrIle
210 215 220
AlaAlaSerHisArgAlaLeuValGluGlySerAlaPheAlaLysSer
225 230 235 240
ValGlySerSerCysSerAlaCysAspAlaIleAlaProGlnIleLeu
245 250 255
CysPheGlnGlnSerPheTrpSerAsnSerGlyTyrIleIleSerAsn
260 265 270
PheValAsnTyrArgSerGlyLysAspIleAsnSerValLeuThrSer
275 280 285
IleHisAsnPheAspProAlaAlaGlyCysAspValAsnThrPheGln
290 295 300
ProCysSerAspArgAlaLeuAlaAsnHisLysValValValAspSer
305 310 315 320
MetArgPheTrpGlyValAsnSerGlyArgThrAlaGlyLysAlaAla
325 330 335
AlaValGlyArgTyrAlaGluAspValTyrTyrAsnGlyAsnProTrp
340 345 350
TyrLeuAlaThrLeuAlaAlaAlaGluGlnLeuTyrAspAlaValTyr
355 360 365
ValTrpLysLysGlnGlySerIleThrValThrSerThrSerLeuAla
370 375 380
PhePheLysAspLeuValProSerValSerThrGlyThrTyrSerSer
385 390 395 400
SerSerSerThrTyrThrAlaIleIleAsnAlaValThrThrTyrAla
405 410 415
AspGlyPheValAspIleValAlaGlnTyrThrProSerAspGlySer
420 425 430
LeuAlaGluGlnPheAspLysAspSerGlyAlaProLeuSerAlaThr
435 440 445
HisLeuThrTrpSerTyrAlaSerPheLeuSerAlaAlaAlaArgArg
450 455 460
AlaGlyIleValProProSerTrpGlyAlaAlaSerAlaAsnSerLeu
465 470 475 480
ProGlySerCysSerAlaSerThrValAlaGlySerTyrAlaThrAla
485 490 495
ThrAlaThrSerPheProAlaAsnLeuThrProAlaSerThrThrVal
500 505 510
ThrProProThrGlnThrGlyCysAlaAlaAspHisGluValLeuVal
515 520 525
ThrPheAsnGluLysValThrThrSerTyrGlyGlnThrValLysVal
530 535 540
ValGlySerIleAlaArgLeuGlyAsnTrpAlaProAlaSerGlyLeu
545 550 555 560
ThrLeuSerAlaLysGlnTyrSerSerSerAsnProLeuTrpSerThr
565 570 575
ThrIleAlaLeuProGlnGlyThrSerPheLysTyrLysTyrValVal
580 585 590
ValAsnSerAspGlySerValLysTrpGluAsnAspProAspArgSer
595 600 605
TyrAlaValGlyThrAspCysAlaSerThrAlaThrLeuAspAspThr
610 615 620
TrpArg
625
(2) INFORMATION FOR SEQ ID NO:8:
  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 27 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: other nucleic acid
    (A) DESCRIPTION: /desc = "5' primer at the Ppum I site"
  (iii) HYPOTHETICAL: NO
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:
CCTTCTTCAAGGACCTCGTTCCCTCCG 27
(2) INFORMATION FOR SEQ ID NO:9:
  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 27 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: other nucleic acid
    (A) DESCRIPTION: /desc = "5' primer"
  (iii) HYPOTHETICAL: NO
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:
CCTTCTTCAAGGACCTCGTTCCCTCCG 27
(2) INFORMATION FOR SEQ ID NO:10:
  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 46 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: double
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: other nucleic acid
    (A) DESCRIPTION: /desc = "fragment of glucoamylase gene"
  (iii) HYPOTHETICAL: NO
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:
CCACCTCCCTCGCCTTCTTCAAGGACCTCGTTCCCTCCGTCAGCAC 46
(2) INFORMATION FOR SEQ ID NO:11:
  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 32 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: other nucleic acid
    (A) DESCRIPTION: /desc = "3' primer at the Mro I site"
  (iii) HYPOTHETICAL: NO
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:
GCTGAGAACTACTATAGGCCTCCATTTAGCGG 32
(2) INFORMATION FOR SEQ ID NO:12:
  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 49 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: double
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: other nucleic acid
    (A) DESCRIPTION: /desc = "fragment of glucoamylase gene"
  (iii) HYPOTHETICAL: NO
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:
TACTGCGACTCTTGATGATACGTGGAGGTAAATCGCTTGCTTCGTACTA 49
(2) INFORMATION FOR SEQ ID NO:13:
  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 32 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: other nucleic acid
    (A) DESCRIPTION: /desc = "3' primer"
  (iii) HYPOTHETICAL: NO
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:
CGCTAAATGGAGGCCTATAGTAGTTCTCAGCG 32
(2) INFORMATION FOR SEQ ID NO:14:
  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 626 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: protein
  (iii) HYPOTHETICAL: NO
  (v) FRAGMENT TYPE:
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:
MetHisLeuValSerSerLeuLeuValValGlyAlaAlaPheGlnAla
1 5 10 15
ValLeuGlyLeuProAspProLeuHisGluLysArgHisSerAspIle
20 25 30
IleLysArgSerValAspSerTyrIleGlnThrGluThrProIleAla
35 40 45
GlnLysAsnLeuLeuCysAsnIleGlyAlaSerGlyCysArgAlaSer
50 55 60
GlyAlaAlaSerGlyValValValAlaSerProSerLysSerSerPro
65 70 75 80
AspTyrTrpTyrThrTrpThrArgAspAlaAlaLeuValThrLysLeu
85 90 95
IleValAspGluPheThrAsnAspTyrAsnThrThrLeuGlnAsnThr
100 105 110
IleGlnAlaTyrAlaAlaAlaGlnAlaLysLeuGlnGlyValSerAsn
115 120 125
ProSerGlySerLeuSerAsnGlyAlaGlyLeuGlyGluProLysPhe
130 135 140
MetValAspLeuGlnGlnPheThrGlyAlaTrpGlyArgProGlnArg
145 150 155 160
AspGlyProProLeuArgAlaIleAlaLeuIleGlyTyrGlyLysTrp
165 170 175
LeuValSerAsnGlyTyrAlaAspThrAlaLysSerIleIleTrpPro
180 185 190
IleValLysAsnAspLeuAlaTyrThrAlaGlnTyrTrpAsnAsnThr
195 200 205
GlyPheAspLeuTrpGluGluValAsnSerSerSerPhePheThrIle
210 215 220
AlaAlaSerHisArgAlaLeuValGluGlySerAlaPheAlaLysSer
225 230 235 240
ValGlySerSerCysSerAlaCysAspAlaIleAlaProGlnIleLeu
245 250 255
CysPheGlnGlnSerPheTrpSerAsnSerGlyTyrIleIleSerAsn
260 265 270
PheValAsnTyrArgSerGlyLysAspIleAsnSerValLeuThrSer
275 280 285
IleHisAsnPheAspProAlaAlaGlyCysAspValAsnThrPheGln
290 295 300
ProCysSerAspArgAlaLeuAlaAsnHisLysValValValAspSer
305 310 315 320
MetArgPheTrpGlyValAsnSerGlyArgThrAlaGlyLysAlaAla
325 330 335
AlaValGlyArgTyrAlaGluAspValTyrTyrAsnGlyAsnProTrp
340 345 350
TyrLeuAlaThrLeuAlaAlaAlaGluGlnLeuTyrAspAlaValTyr
355 360 365
ValTrpLysLysGlnGlySerIleThrValThrSerThrSerLeuAla
370 375 380
PhePheLysAspLeuValProSerValSerThrGlyThrTyrSerSer
385 390 395 400
SerSerSerThrTyrThrAlaIleIleAsnAlaValThrThrTyrAla
405 410 415
AspGlyPheValAspIleValAlaGlnTyrThrProSerAspGlySer
420 425 430
LeuAlaGluGlnPheAspLysAspSerGlyAlaProLeuSerAlaThr
435 440 445
HisLeuThrTrpSerTyrAlaSerPheLeuSerAlaAlaAlaArgArg
450 455 460
AlaGlyIleValProProSerTrpGlyAlaAlaSerAlaAsnSerLeu
465 470 475 480
ProGlySerCysSerAlaSerThrValAlaGlySerTyrAlaThrAla
485 490 495
ThrAlaThrSerPheProAlaAsnLeuThrProAlaSerThrThrVal
500 505 510
ThrProProThrGlnThrGlyCysAlaAlaAspHisGluValLeuVal
515 520 525
ThrPheAsnGluLysValThrThrSerTyrGlyGlnThrValLysVal
530 535 540
ValGlySerIleAlaArgLeuGlyAsnTrpAlaProAlaSerGlyLeu
545 550 555 560
ThrLeuSerAlaLysGlnTyrSerSerSerAsnProLeuTrpSerThr
565 570 575
ThrIleAlaLeuProGlnGlyThrSerPheLysTyrLysTyrValVal
580 585 590
ValAsnSerAspGlySerValLysTrpGluAsnAspProAspArgSer
595 600 605
TyrAlaValGlyThrAspCysAlaSerThrAlaThrLeuAspAspThr
610 615 620
TrpArg
625
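As a reading aid for the three-letter protein listings above, the following Python sketch converts a run of three-letter residue codes into one-letter code and counts the residues, so that a transcribed sequence can be compared against its declared LENGTH. It is illustrative only: the function name `three_to_one` and the lookup table are assumptions introduced here, not part of the specification, and only the first row of SEQ ID NO:7 (residues 1 to 16) is used as input.

```python
# Illustrative sketch, not part of the specification: convert concatenated
# three-letter amino acid codes (as printed in the sequence listing) to
# one-letter codes. The helper name and table are assumptions.

THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def three_to_one(listing: str) -> str:
    """Translate concatenated three-letter codes; whitespace is ignored."""
    compact = "".join(listing.split())
    if len(compact) % 3 != 0:
        raise ValueError("input length is not a multiple of three")
    codes = [compact[i:i + 3] for i in range(0, len(compact), 3)]
    # An unrecognised code raises KeyError, which flags a transcription error.
    return "".join(THREE_TO_ONE[code] for code in codes)


# First row of SEQ ID NO:7 as printed in the listing (residues 1-16).
first_row = "MetHisLeuValSerSerLeuLeuValValGlyAlaAlaPheGlnAla"
one_letter = three_to_one(first_row)
print(one_letter, len(one_letter))  # MHLVSSLLVVGAAFQA 16
```

Applied to a complete listing, the length of the returned string would be expected to equal the residue count given under LENGTH for that sequence.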

We claim:
 1. A construct including at least the regulated promoter sequence of the gene encoding the protein glucoamylase of Neurospora crassa and further including at least one restriction site whereby a coding sequence for a heterologous peptide can be introduced into the gene so a heterologous peptide is expressed under the control of said promoter sequence.
 2. A construct according to claim 1 wherein the promoter sequence includes an upstream activator comprising the TATA box at position -101 with respect to the start codon ATG shown in FIG. 1 (SEQ ID NO:6).
 3. A construct according to claim 1 comprising plasmid pPS8.
 4. A construct according to claim 1 comprising the plasmid pGla-Xho I.
 5. A construct according to claim 1 comprising the plasmid pGla-Mro I.
 6. A construct according to claim 1 comprising the plasmid pGE.
 7. A construct according to claim 1 comprising the plasmid pGla XL.
 8. A construct according to claim 1 comprising the plasmid pGla XLX.
 9. A construct according to claim 1 comprising the plasmid pGLa MXLX.
 10. A construct according to claim 1 comprising the plasmid pGS.
 11. The regulated promoter of claim 1 which has the DNA sequence structure shown in FIG. 1 (SEQ ID NO:6).
 12. A construct according to claim 1, wherein the promoter sequence includes an upstream activator comprising the CAAT box at position -133 with respect to the start codon ATG shown in FIG. 1 (SEQ ID NO:6).
 13. A construct according to claim 12 wherein the promoter sequence comprises the first 938 nucleotides of the DNA sequence structure shown in FIG. 1 (SEQ ID NO:6).
 14. A construct according to claim 1 wherein the construct further includes a Neurospora-selectable marker.
 15. A construct according to claim 14 wherein said marker provides hygromycin-resistance.
 16. A construct according to claim 15 wherein said construct further comprises an E. coli-selectable marker.
 17. A construct according to claim 16 wherein said marker is a gene encoding ampicillin-resistance.
 18. A construct according to claim 1 wherein the construct further comprises DNA sequence structure encoding a secretion signal sequence.
 19. A construct according to claim 18 wherein said DNA sequence structure is translationally fused to the codon sequence of a heterologous peptide.
 20. A method for transforming filamentous fungus comprising: (a) introducing the DNA construct of claim 1, linked to a selectable marker, into a filamentous fungal cell; and (b) culturing said filamentous fungal cell in the presence of an agent to which said selectable marker confers resistance; and (c) monitoring resistant colonies of said filamentous fungal cells for the production of a heterologous polypeptide; (d) wherein said filamentous fungal cells are transformed.
 21. A method according to claim 20 wherein said filamentous fungus is Neurospora crassa.
 22. A method for the production of a pre-selected heterologous peptide from at least one filamentous fungus comprising: a) providing a construct in accordance with claim 20 modified to include a gene encoding a heterologous peptide, b) transforming a pre-selected species of filamentous fungus with said construct, c) culturing said transformed fungus; and d) harvesting the heterologous peptide provided in said construct.
 23. A method according to claim 22 wherein said filamentous fungus is Neurospora crassa.
 24. A filamentous fungus having inserted therein a construct in accordance with claim 1.
 25. A filamentous fungus according to claim 24 wherein said fungus is Neurospora crassa.
 26. A primer comprising at least one of the sequence structures shown in FIG. 4, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:12 or SEQ ID NO:13.
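
The relationship between the primer and gene-fragment sequences named in claim 26 can be examined with a short script. The Python sketch below is illustrative only: it locates the 5' primer SEQ ID NO:8 within the glucoamylase gene fragment SEQ ID NO:10 and searches the primer for the motif RGGWCCY. That motif is the commonly cited recognition sequence for PpuM I and is an assumption taken from general restriction-enzyme references, not from this document; the helper names `motif_to_regex` and `find_motif` are likewise hypothetical.

```python
import re

# Sequences copied from the sequence listing (5' to 3' as printed).
SEQ_ID_8 = "CCTTCTTCAAGGACCTCGTTCCCTCCG"                       # 5' primer, 27 nt
SEQ_ID_10 = "CCACCTCCCTCGCCTTCTTCAAGGACCTCGTTCCCTCCGTCAGCAC"   # gene fragment, 46 nt

# IUPAC codes needed for the assumed motif: R = A/G, W = A/T, Y = C/T.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "[AG]", "W": "[AT]", "Y": "[CT]"}

# ASSUMPTION: PpuM I recognition motif from general references,
# not stated anywhere in this document.
PPUM_I_MOTIF = "RGGWCCY"


def motif_to_regex(motif: str) -> str:
    """Expand an IUPAC motif into a regular-expression fragment."""
    return "".join(IUPAC[base] for base in motif)


def find_motif(seq: str, motif: str) -> list:
    """Return 0-based start offsets of all matches, including overlapping ones."""
    pattern = re.compile("(?=" + motif_to_regex(motif) + ")")
    return [match.start() for match in pattern.finditer(seq)]


primer_offset = SEQ_ID_10.find(SEQ_ID_8)           # -1 would mean "not found"
site_offsets = find_motif(SEQ_ID_8, PPUM_I_MOTIF)
print(primer_offset, site_offsets)  # 12 [9]
```

Under these assumptions the primer aligns exactly with the fragment starting at offset 12, and the assumed motif occurs once in the primer, as AGGACCT at offset 9.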